We discuss the extraction of information from detected binary black hole (BBH) coalescence
gravitational wave bursts, focusing in particular on the nonlinear merger phase of the coalescence, which
occurs after the gradual inspiral of the bodies in the binary and before the ringdown of the system 
to its final Kerr black hole state. 

We report four principal results: (i) If numerical relativity simulations have not successfully produced 
theoretical template waveforms for the merger by the time that BBH waves are first detected by 
LIGO/VIRGO interferometers, or if they cannot produce a set of templates that completely covers 
the space of merger waveforms, then observers can use simple band-pass filters to study the merger 
waves. For BBHs of total mass $< 40 M_\odot$ which are detected via their inspiral waves, we estimate
that the signal-to-noise ratio from band-pass filtering will typically be of order unity for initial and
advanced LIGO interferometers. Thus, the merger waves should be just visible above the noise for
typical events; rare, stronger events will be more visible, and thus more interesting. (ii) We use
Bayesian statistics and the maximum likelihood framework to sketch out an optimized method for 
extracting the merger waveform from the detector output. The method is based on a "perpendicular 
projection" of the observed (noisy) signal onto an appropriate function space that incorporates all 
our (possibly sketchy) prior knowledge of the waveforms. We argue that the best type of "basis 
functions" to use to specify this function space is wavelets or wavelet-like functions, and we develop 
the method in some detail in the language of wavelets. In an Appendix, we sketch an extension 
of the method which allows one to reconstruct the two independent polarization components of 
the merger waves from the outputs of a network of several interferometers. (iii) We propose a
computational strategy for numerical relativists to pursue, if they successfully produce computer 
codes for generating merger waveforms, but if running the codes is too expensive to permit an 
extensive survey of the merger parameter space. In this case, for LIGO/VIRGO data analysis 
purposes, it would be advantageous to do a very coarse survey of the parameter space aimed at 
exploring several qualitative issues and at determining the ranges of the several key parameters 
which we describe. (iv) If merger templates are available for data analysis, matched filtering can be
used to make quantitative tests of general relativity in a highly dynamical and nonlinear regime, and 
to make measurements of the binary's parameters. These measurements and tests can be carried 
out with moderate accuracy by LIGO/VIRGO, and with extremely high accuracy by the proposed 
space-based interferometer LISA. Using information theory, we estimate the total number of bits 
of information obtainable from the merger waves (~ 10 to 60 bits for LIGO/VIRGO, up to ~ 200 
bits for LISA), and estimate how much information would be lost due to numerical errors in the 
templates or to sparseness in the template grid. We deduce an approximate rule-of-thumb for the 
required accuracy of merger templates and for their spacing. 



I. INTRODUCTION AND SUMMARY 

A. Gravitational waves from binary black hole systems

With the kilometer-scale, ground-based interferometric
gravitational-wave observatories LIGO [1], VIRGO [2], and
GEO600 [3] expected to be on line and taking data within the
next few years, and with the space-based interferometer LISA
in the planning and development stage, much effort is
currently going into understanding potential
gravitational-wave sources and associated data analysis
issues. One potentially very interesting and important class
of source is the coalescence of binary black holes (BBHs)
where the two black holes have comparable masses. Such
binaries with total masses $M$ in the range
$10 M_\odot < M < 10^3 M_\odot$ could be detected by
ground-based interferometers, and those with
$10^5 M_\odot < M < 10^8 M_\odot$ by LISA.

The evolution of these systems, and the gravitational
waves that they emit, can be roughly divided into three
successive epochs: an adiabatic inspiral epoch, in which
the evolution of BBH systems is driven by radiation
reaction, and which terminates roughly at the last stable
circular orbit; a violent, dynamical merger epoch; and a
ringdown epoch in which the emitted gravitational waves are
dominated by the $l = m = 2$ quasinormal mode radiation of
the final Kerr black hole. Gravitational waves from the
merger phase could be rich with information about
relativistic gravity in a highly nonlinear, highly dynamical
regime which is poorly understood today.

Theoretical predictions of the gravitational waveforms
$h_+(t)$ and $h_\times(t)$ produced in the three phases of
BBH coalescences will be useful both for detecting the
gravitational-wave signal, and for interpreting and making
deductions from the observed waveforms, i.e., for
extracting information from the waves.

For the inspiral phase, such theoretical waveforms or
waveform templates have already been computed analytically
to post-2.5-Newtonian order. These templates will be
accurate enough for separations $r > 12M$ that their errors
will not significantly impede wave detection: for more
details see, for example, Ref. [11] and Sec. III below. The
phase evolution of the inspiral waves between $r \sim 12M$
and $r \sim 6M$, where $M$ is the total mass of the system
and $r$ the distance between the black holes in
Schwarzschild coordinates, will not be accurately described
by the post-Newtonian approximation [12]. Alternative
analytic and numerical approximation schemes are under
development for modeling the coalescence and computing the
waves in this "intermediate binary black hole" (IBBH)
regime. For the purpose of this paper, we consider this
IBBH regime to be part of the inspiral phase of the
coalescence.

Templates for the ringdown phase of the coalescence
are obtained using perturbation theory on the background
of the final Kerr black hole; these templates
consist of exponentially damped sinusoids.
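To make the phrase "exponentially damped sinusoid" concrete, the following Python sketch evaluates such a waveform. The function name `damped_sinusoid` and the numerical values (a 200 Hz mode with a 10 ms damping time) are illustrative assumptions chosen only for this example; they are not parameters quoted in this paper.

```python
import numpy as np

def damped_sinusoid(t, amplitude, frequency_hz, damping_time_s, phase=0.0):
    """Exponentially damped sinusoid that switches on at t = 0.

    All argument names are hypothetical; the waveform is
    amplitude * exp(-t / damping_time_s) * cos(2 pi f t + phase) for t >= 0
    and zero for t < 0.
    """
    t = np.asarray(t, dtype=float)
    # Clip negative times before exponentiating to avoid overflow, then mask.
    envelope = np.where(t >= 0.0, np.exp(-np.clip(t, 0.0, None) / damping_time_s), 0.0)
    return amplitude * envelope * np.cos(2.0 * np.pi * frequency_hz * t + phase)

# Illustrative values only (assumptions, not taken from the paper).
t = np.linspace(0.0, 0.05, 2049)
h = damped_sinusoid(t, amplitude=1.0, frequency_hz=200.0, damping_time_s=0.01)
```

After one damping time the envelope has fallen by a factor of e, which is the only property of the template shape that this sketch is meant to convey.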

In contrast to the situation for the inspiral and
ringdown phases, there is at the present time very little
theoretical understanding of gravitational waves from the
merger phase, and no merger templates exist at all.
Detailed understanding of the merger probably will come
only from numerical relativity. One rather large effort to
compute the dynamics of BBH mergers is the American
Grand Challenge Alliance, an NSF funded collaboration
of physicists and computer scientists at eight institutions;
similar efforts are underway elsewhere. Modeling
BBH mergers is an extremely difficult task; the numerical
relativists who are writing codes for simulating BBH
mergers are beset with many technical difficulties.

Our theoretical understanding of BBH mergers could
be in any one of several different states by the time
the first BBH coalescences are detected: (i) No
information: The supercomputer simulation codes have not
yet been successfully implemented, thus no information
about waves from BBH mergers is available. (ii) Information
limited in principle: A small amount of information
about the waves is available. This could arise if working
supercomputer codes are available, but the codes cannot
simulate fully general BBH mergers, but only those in
some special class (e.g., vanishing initial spins, or equal
mass black holes). Or, it could arise if the codes can
simulate arbitrary mergers but technical difficulties
prevent the extraction of accurate gravitational waveforms;
in such cases one would know at least the duration of
the merger waves. (iii) Information limited in practice:
Fully general BBH mergers can be simulated and waveforms
can be extracted, but each run of these codes to
produce a template is very expensive in terms of computer
time and cost, and therefore only a small number
of representative template shapes can be computed and
stored. (The total number of template shapes required
to cover the entire range of behaviors of BBH mergers
is likely to be in the range of thousands to millions or
more.) (iv) Full information: A complete set of
theoretical templates has been computed and is available for
data analysis. This fourth possibility seems rather
unlikely in the time frame of the first detections of BBH
coalescences.



B. Detecting the waves 

Depending on the system's mass, some BBH coalescence
events will be most easily detected by searching
for the inspiral waves, others by searching for the
ringdown waves, and others by searching for the merger waves
themselves. In paper I of this series [11], we analyzed the
prospects for detecting BBH events using these three
different types of searches, for initial and advanced LIGO
interferometers and for LISA. We briefly review here some
of the relevant aspects and conclusions of that analysis.

Low-mass BBHs [$M < 30 M_\odot$ for the first LIGO
interferometers; $(1 + z)M < 80 M_\odot$ for the advanced LIGO
interferometers; $(1 + z)M < 3 \times 10^6 M_\odot$ for LISA,
where $z$ is the source's cosmological redshift] are best
searched for via their inspiral waves. Such searches will use
matched filtering with post-Newtonian templates. These
low-mass binaries may be the most common type of detected BBH
source. Moreover, they may well be the first detected
source of gravitational waves and be detected before
binary neutron star inspirals, since the range of initial
LIGO interferometers for BBHs with $M < 50 M_\odot$ is
$\sim 250$ Mpc whereas binary neutron stars can be seen out
to $\sim 25$ Mpc [11,19].

Higher mass BBH systems are best searched for via 
their ringdown waves or merger waves. A matched filter- 
ing search for ringdown waves will be possible as soon as 
data are available, since ringdown templates are simple 
to construct. 

A matched filtering search for merger waves could be
performed if a complete set of merger templates were
available. We estimated in Ref. [11] that the resulting
event detection rate would be a factor of roughly 40
higher than the event rate from inspiral and ringdown
searches for a certain range of BBH masses
($30 M_\odot < M < 200 M_\odot$ for initial LIGO
interferometers, $100 M_\odot < M < 400 M_\odot$ for advanced
LIGO interferometers, and
$3 \times 10^6 M_\odot < (1 + z)M < 3 \times 10^{?} M_\odot$
for LISA). However, as mentioned above in Sec. I A, it seems
very unlikely that a complete bank of numerical templates will
be available. If merger templates are not available, one
can still search for the merger waves using simple
band-pass filtering (i.e., using filters that throw away all
signal and noise except for that within some prescribed
frequency band), or more effectively using techniques such
as the noise-monitoring search method described in Refs.
[11,20]. The gain factor in event detection rate for
noise-monitoring searches for merger waves, over inspiral and
ringdown searches, will be roughly 4 to 10, depending
on (among other things) whether or not one has firm
information from representative supercomputer simulations
about the possible durations and frequency bandwidths
of merger waveforms [21].
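As a minimal illustration of what is meant here by band-pass filtering, the following Python sketch discards every Fourier component of a data stream that lies outside a prescribed frequency band. The function name `band_pass`, the sampling rate, and the band edges are assumptions made for this example only; a practical analysis would likely use a filter with a smoother frequency response.

```python
import numpy as np

def band_pass(data, sample_rate_hz, f_low_hz, f_high_hz):
    """Zero all Fourier components outside [f_low_hz, f_high_hz].

    A deliberately crude "brick-wall" filter (an illustrative assumption,
    not a method specified in the text).
    """
    data = np.asarray(data, dtype=float)
    spectrum = np.fft.rfft(data)
    freqs = np.fft.rfftfreq(data.size, d=1.0 / sample_rate_hz)
    keep = (freqs >= f_low_hz) & (freqs <= f_high_hz)
    # Components outside the band are thrown away; the rest are untouched.
    return np.fft.irfft(np.where(keep, spectrum, 0.0), n=data.size)

# Toy data: one tone inside the band plus one tone outside it.
sample_rate_hz = 1024.0
t = np.arange(1024) / sample_rate_hz
in_band = np.sin(2.0 * np.pi * 100.0 * t)
out_of_band = np.sin(2.0 * np.pi * 400.0 * t)
filtered = band_pass(in_band + out_of_band, sample_rate_hz, 50.0, 200.0)
```

In this toy case the out-of-band tone is removed and the in-band tone is returned unchanged, because both tones fall exactly on frequency bins of the discrete transform.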

Once a BBH event has been detected, the location of 
the three different phases of the waves in the data stream 
will be known to a fair approximation. For many de- 
tected events, though, it will not be the case that all 
three phases will be detectable. For instance, typical low 
mass BBH events which are detected via their inspiral 
waves will have ringdown waves that are too weak to be 
detected; see Ref. [11] and Sec. III. Likewise, very massive
systems which are detected via their ringdown waves
might in some cases not yield a detectable inspiral signal. 



C. Extracting the waves' information: three scenarios

In contrast to paper I [11], where we focused on expected
signal strengths and search strategies for BBH
events, in this paper we focus on measurements of the 
merger waveform itself: on reconstructing the waveform 
from the instrumental data stream, and on using the 
measured waveforms to learn about the BBH source and 
about the dynamics of very strong field general relativity. 
At present, because merger waveforms are so poorly un- 
derstood, it is hard to say how much one can learn about 
BBH systems from their merger waves. Both how well we 
can reconstruct BBH waveforms and how much we can 
learn from such reconstructions depend on the success of 
efforts to numerically simulate BBH mergers. 

In this subsection, as background to the discussion of
the contents of this paper in Sec. I D below, we describe
in general terms three possible different scenarios for data
analysis of the merger waves:

The first possibility [corresponding to situation (i) in
Sec. I A] is that numerical computations might provide
no input at all that can be used to aid gravitational-wave
data analysis. In this case, with no templates to
guide the interpretation of the measured waveform, it
will not be possible to obtain any information about the
BBH source or about strong-field general relativity from
the merger waves. One's goal will simply be to measure
as accurately as possible the merger waveform's shape.
For this waveform shape measurement, observers should
make use of all possible prior information obtainable from
analyses of the inspiral and/or ringdown signals, if they
are detectable. (For example, if the system is detected
via its inspiral waves, then one will know that the merger
waves lie immediately following the inspiral waves in the
data stream, and must join smoothly onto the ringdown
waves.)

Second [situations (ii) and (iii) of Sec. I A], if one
has only a few, representative supercomputer simulations
and associated waveform templates at one's disposal, one
might simply perform a qualitative comparison between
the measured waveform and templates in order to deduce
qualitative information about the BBH source. For
instance, simulations might demonstrate a strong
correlation between the duration of the merger (in units of
the total mass of the system) and the spins of the black
holes in the binary. One might then be able to deduce
some information about the black hole spins from the
duration of the reconstructed merger waveform, without
having to find a template that exactly matched the
measured waveform. In this second scenario, for the purpose
of reconstructing a "best fit" merger waveform from the
noisy data stream, one should use the prior information
from the measured inspiral and/or ringdown waves, and
in addition the prior information (for example the
expected range of frequencies) one has about the merger
waveforms' behaviors from the representative
supercomputer simulations.

The third scenario consists of performing matched
filtering analyses of the data stream with merger templates
in order to measure the parameters of the BBH binary
and to test general relativity. This will certainly be
feasible if one has a complete set of merger templates
[situation (iv) of Sec. I A]. However, in some cases matched
filtering parameter extraction may also be feasible in
situation (iii) of Sec. I A, where one has a working
computer code for simulating BBH mergers but where each
run of the code is so expensive in computer time and cost
that it is not possible to calculate a complete set of
templates. In such a case, after the merger waves have been
detected, it may be possible to perform several runs of
the supercomputer code, concentrated in the appropriate
small region of parameter space compatible with one's
measurements from the inspiral and ringdown waves, in
an effort to match the observed waveforms. In either
case (complete set of templates or templates produced as
needed), a conclusive fit between a numerical waveform
and the measured waveform would be a triumph for
general relativity, testing the theory in an extremely strong
field, fast motion regime with no approximations, and
would provide an unequivocal signature of the existence
of black holes.

In this paper, as we now outline, we consider the re- 
quirements for and the implications of all three of these 
modes of data analysis. 

D. Extracting the waves' information: our analyses, suggested tools, and results

The four principal purposes of this paper are: (i) to 
review and discuss the useful information carried by all 
three phases of the waves and the prospects for its extrac- 
tion, both with and without templates; (ii) to suggest a 
data analysis method that can be used in the absence of 
templates to obtain from the noisy data stream a "best- 
fit" merger waveform shape; (iii) to provide input to nu- 
merical relativity simulations by highlighting the kinds of 
information that supercomputer simulations can provide, 
other than merger templates, that can aid BBH merger 
data analysis; and (iv) to provide input to numerical rel- 
ativity simulations by deriving some requirements that 
numerical templates must satisfy in order to be as useful 
as possible for data analysis purposes. We now turn to 
a detailed summary of our analyses and results in these 
four areas. 

We first consider the situation in which very little
information about the merger waveform is available to aid
data analysis. The data analysis method that we suggest
[item (ii) in the above paragraph] reduces in this case
to band-pass filtering, so observers will likely resort to
simple band-pass filters to study the merger waves. The
first question to address in this context is whether the
merger signal is likely to even be visible; that is,
whether the signal will stand out above the background
noise level in the band-pass filtered detector output.

The merger signal will be visible if the band-pass
filtering signal-to-noise ratio (SNR) is large compared to
unity. In paper I of this series, we estimated the matched
filtering SNRs that could be obtained from the merger
signal if templates were available (cf. Figs. 4, 5, and 6
of Ref. [11]); and we estimated that the SNRs that can
be achieved for the merger signal with band-pass filters
will be roughly a factor of 5 smaller than the matched
filtering SNRs. The resulting values of band-pass filtering
SNR depend on the distance to the BBH. In Sec. IV we
estimate the distance to typical BBHs with $M < 20 M_\odot$
that have been detected via their inspiral signals by
initial LIGO interferometers, and we infer that the merger
signal is likely to be marginally visible (band-pass
filtering SNR $\sim 1$) for typical detected events. For
advanced LIGO interferometers, we estimate that the merger
signal is somewhat less likely to be visible (band-pass
filtering SNR $\sim 1/4$). The reason for this somewhat
counterintuitive result is that matched filtering is more
efficient, relative to band-pass filtering, for advanced
interferometers. Thus, only the somewhat rarer, stronger
merger signals will be visible for advanced LIGO
interferometers. For LISA, by contrast, we estimate that the
band-pass filtering SNRs will typically be $> 200$ and thus
the merger waves will easily be visible.

In Sec. IV A, for comparison, we estimate the band-pass
filtering SNRs of the last few cycles of inspiral waves
(i.e., just before merger) and find them to be typically of
order unity for low mass BBH events detected by
ground-based interferometers. Thus, the last few cycles of
the inspiral should be (just about) individually visible
above the interferometer noise.

When templates are not available, one's goal will be to
reconstruct as well as possible the merger waveform from
the noisy data stream. In Sec. V we use Bayesian
statistics and the framework of maximum likelihood estimation
to sketch out an optimized method for performing such
a reconstruction in the absence of theoretical templates.
The method is based on a "perpendicular projection"
of the observed noisy signal onto an appropriate
function space that encodes all our (possibly sketchy) prior
knowledge about the waveforms. We argue that the best
type of "basis functions" to use to specify this function
space are wavelets: functions which simultaneously allow
localization in time and in frequency. We develop this
reconstruction technique in detail using the language of
wavelets. We show that the operation of "perpendicular
projection" into the function space is a special case of
Wiener optimal filtering. In Sec. V D and Appendix A
we demonstrate mathematically the rather obvious result
that the reconstructed signal will statistically be a good
representation of the true signal (as measured by a
correlation integral between the true signal and reconstructed
signal) only in the regime where the band-pass filtering
SNR is large.
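The idea of a "perpendicular projection" onto a function space can be sketched numerically as follows. This is only a toy version under two simplifying assumptions of ours: the noise is taken to be white, so that the noise-weighted inner product reduces to an ordinary dot product, and a handful of sinusoids stand in for the wavelet basis discussed in the text. The function name `project_onto_subspace` is hypothetical.

```python
import numpy as np

def project_onto_subspace(data, basis):
    """Orthogonal projection of `data` onto the span of the columns of `basis`.

    Assumes white noise, so "perpendicular" means perpendicular in the
    ordinary Euclidean sense. Solved by least squares.
    """
    coeffs, *_ = np.linalg.lstsq(basis, data, rcond=None)
    return basis @ coeffs

rng = np.random.default_rng(0)
n = 256
t = np.arange(n) / n

# Stand-in basis encoding "prior knowledge": the signal is assumed to lie
# in the span of three sinusoids (wavelets would be used in practice).
basis = np.column_stack([np.sin(2.0 * np.pi * k * t) for k in (3, 4, 5)])
signal = basis @ np.array([1.0, -0.5, 0.25])
data = signal + 0.3 * rng.standard_normal(n)

estimate = project_onto_subspace(data, basis)
residual = data - estimate
```

The residual is orthogonal to every basis function, and the estimate is never farther from the true signal than the raw data are, since the projection removes the part of the noise lying outside the assumed function space.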

In Appendix A, we describe an extension of the method
to a network of several gravitational wave detectors which
allows one to reconstruct, from the outputs of all the
detectors in the network, the two independent waveforms
$h_+(t)$ and $h_\times(t)$ of the merger waves. We also show
that our method for a network is an extension and
generalization of a method previously suggested by Gürsel and
Tinto [22]. Secs. A 1 and A 2 of Appendix A overlap
somewhat with unpublished analyses by Sam Finn [23].
Finn uses similar mathematical techniques to analyze the
use of multiple interferometers to measure a stochastic
background of gravitational waves and to measure waves
of well-understood form, applications which are rather
different from the measurement of bursts of unknown
form that we consider.

Our waveform reconstruction algorithm comes in two
versions: a simple version incorporating the above
mentioned "perpendicular projection", described in Sec. V B,
and a more general and powerful version that allows
one to build in more prior information, described in
Sec. V C. If one's prior information consists only of
knowledge about the signal's bandwidth, then the
best-fit reconstructed waveform is just the band-pass filtered
data stream. However, one can also build in as input to
the method the expected duration of the signal, the fact
that it must match up smoothly to the measured inspiral
waveform, etc.; in such cases the reconstructed waveform
differs from the band-pass filtered data stream.

Qualitative information about BBH merger waveforms
will thus be very useful as prior information for signal
reconstruction. Such information will also be useful as a
basis for qualitative comparisons with the reconstructed
waveforms in order to make qualitative deductions about
the BBH source, as outlined in Sec. I C above.
Supercomputer simulations should be able to provide such
information, in the case where these codes can successfully
simulate BBH mergers and produce templates, but
where running the codes is too expensive to permit an
extensive survey of the merger parameter space (i.e., too
expensive to produce a complete set of templates). In
this situation, a small number of representative
simulations could still be extremely useful. In Sec. VI, we
give examples of the types of information such
supercomputer simulations could provide (short of providing a
complete set of merger templates): the range of numbers
of cycles in the merger waveform and how this number
depends on parameters such as the initial spins of the
black holes; the (closely related) range of temporal
durations of merger waveforms, and how duration varies with
parameters of the binary; the minimum and maximum
frequencies of typical merger energy spectra;
characteristics of the waveform's time/frequency behavior (whether
it involves a monotonic chirp or not, and whether in some
cases it can be characterized as a modulated carrier wave
or not); and which quasinormal modes are typically
excited, and how strongly.

We turn next to issues concerning the use of numerical
templates in data analysis. In Sec. VII, we begin
to examine matched filtering of merger waves with
templates. As mentioned in Sec. I C above, such matched
filtering may be possible even if a complete set of merger
templates does not exist: runs of merger template
generation codes can be performed as part of the data analysis
of measured BBH signals in an effort to produce a
template that matches the measured waveform; such efforts
may or may not be successful. We review in Sec. VII
what one should be able to achieve with matched
filtering: measurements of the binary's physical parameters
(masses, vectorial spin angular momenta, etc.) which are
independent of any such measurements from the inspiral
and ringdown waves; and quantitative tests of general
relativity in the most extreme of domains: highly
nonlinear, rapidly dynamical, highly non-spherical spacetime
warpage. These measurements and tests will be possible
with modest accuracy with LIGO/VIRGO, and with
extremely high accuracy with LISA (for which the merger
matched filtering SNRs are typically $> 10^4$ [11]).

In order for such measurements and tests to be as
successful as possible, the numerically generated merger
templates must satisfy certain requirements. In Sec. VIII
we derive a simple formula [Eq. (8.2)] that numerical
relativists can use to ensure that the waveforms produced by
their simulations are sufficiently accurate for data
analysis. In Sec. VIII A we describe how this formula can be
used to regulate the accuracy with which the numerical
simulations are carried out. The formula is derived from
the following requirements: first, any signal searches that
use matched filtering with merger templates should suffer
a fractional loss of event rate due to template inaccuracies
of no more than 3%; and second, when using templates to
fit for and measure the physical parameters of the BBH
source (masses, spins etc.), the systematic errors due to
template inaccuracies should always be smaller than the
detector-noise induced statistical errors. The derivation
of the formula from these two requirements is given in
Sec. VIII B.
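For orientation, the 3% event-rate criterion can be translated into a tolerable loss of signal-to-noise ratio. The sketch below rests on a standard assumption that is ours rather than something stated in this section: sources are distributed uniformly in Euclidean space, so the distance reach is proportional to the SNR and the detection rate is proportional to its cube. The function name `event_rate_loss` is hypothetical.

```python
def event_rate_loss(fractional_snr_loss):
    """Fractional loss of event rate for a given fractional loss of SNR.

    Assumes rate is proportional to (reach)^3 and reach is proportional to
    SNR, i.e. uniformly distributed sources in Euclidean space.
    """
    return 1.0 - (1.0 - fractional_snr_loss) ** 3

# A 1% loss in SNR costs just under 3% of the events.
loss_for_one_percent = event_rate_loss(0.01)
```

Under this assumption, keeping the event-rate loss below 3% amounts to keeping the SNR loss due to template inaccuracies below about 1%.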



In Sec. IX, we address again the issue of template
accuracy requirements, and also the issue of the required
spacing of templates in parameter space in the construction
of a grid of templates, by using the mathematical
machinery of information theory. In information theory, a
quantity called "information" (analogous to entropy) can
be associated with any measurement process: it is
simply the base 2 logarithm of the number of distinguishable
outcomes of the measurement [24,25]. Equivalently, it is
the number of bits required to store the knowledge gained
from the measurement. We specialize the notions of
information theory to gravitational wave measurements,
and define two different types of information: (i) the
"total" information $I_{\rm total}$, which is the base 2
logarithm of the total number of distinguishable waveform
shapes that the measurement could have produced; and (ii) a
smaller "source" information $I_{\rm source}$, which is the
base 2 logarithm of the total number of distinguishable
waveform shapes that the measurement could have produced and
that are generated by BBH mergers. This second
measure of information is equivalent to the base 2 logarithm
of the total number of independent BBH sources that the
measurement could have distinguished. We give precise
definitions of these two notions of information [Eqs. (9.2)
and (9.11)] in Sec. IX. In Appendix B, we derive
simple analytic approximations for the quantities
$I_{\rm total}$ and $I_{\rm source}$ [Eqs. (9.8) and (9.12)],
expressing them in terms of the merger signal's matched
filtering signal-to-noise ratio $\rho$, the number of
independent real data points $\mathcal{N}_{\rm bins}$
in the observed signal, and the number of parameters
$\mathcal{N}_{\rm param}$ on which merger templates have a
significant dependence. We estimate that the total
information gain $I_{\rm total}$ is typically of the order of
$\sim 10$ to $\sim 120$ bits for LIGO/VIRGO, and can be up to
$\sim 400$ bits for LISA; and that the source information gain
$I_{\rm source}$ is typically of the order of 10 to 70 bits
for LIGO/VIRGO, and can be up to $\sim 200$ bits for LISA.
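The basic definition used here, information as the base 2 logarithm of the number of distinguishable outcomes, is simple enough to evaluate directly. The following Python sketch only illustrates that definition; it does not implement the approximations of Appendix B, and the function name `information_bits` is hypothetical.

```python
import math

def information_bits(num_distinguishable_outcomes):
    """Information, in bits, of a measurement with the given number of
    distinguishable outcomes: the base 2 logarithm of that number."""
    if num_distinguishable_outcomes < 1:
        raise ValueError("need at least one distinguishable outcome")
    return math.log2(num_distinguishable_outcomes)

# 1024 distinguishable outcomes correspond to 10 bits.
bits_small = information_bits(1024)

# Conversely, 200 bits correspond to 2**200 distinguishable outcomes.
bits_large = information_bits(2 ** 200)
```

Read in reverse, a source information of about 200 bits means that the measurement could in principle discriminate among roughly 2^200, or about 10^60, distinct sources.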



In Sec. IX C, we estimate the loss in information about
the BBH source, $\delta I_{\rm source}$, that would result
from template inaccuracies [Eq. (9.20) below]; this allows us
to re-derive the criterion for the template accuracy
requirements obtained in Sec. VIII. We also estimate the loss
in information $\delta I_{\rm source}$ that would result from
having insufficiently closely spaced templates in a template
grid [Eq. (9.24) below], and we deduce an approximate
criterion for how closely templates must be spaced.

E. Organization of this paper

The remainder of this paper is organized as follows:
In Sec. II, we define the notations and conventions that
we will use throughout the paper. In Sec. III, we
review in moderate detail the information obtainable from
the inspiral and ringdown phases of the waves for
detected BBH events, which will be used as prior
information when attempting to analyze the merger phase.
In Sec. IV we discuss the visibility of BBH coalescence
waveforms. In Sec. IV A, we first compute the band-pass
filtering SNR for the last few cycles of the inspiral; this
serves as background to the merger visibility analysis,
and is relevant to the merger visibility itself: if the end
of the inspiral is visible, then the beginning of the merger
will most likely be visible as well. In Sec. IV B, we analyze
the merger visibility.

In Sec. V we present our method for optimally
reconstructing the merger waveform from the interferometer
output. We derive the method in Sec. V B, and in
Appendix A we present an extension of the method to a
network of several gravitational-wave detectors. In Sec.
V C we describe another extension of the method that
allows one to incorporate prior information in a more
effective way. In Sec. V D we quantify the fidelity of the
reconstructed waveform by defining a normalized
correlation coefficient that describes how well the reconstructed
wave correlates with the true waveform. We show in
Appendix A that this coefficient will be close to 1 (i.e.,
that the reconstructed waveform will be close to the true
waveform) when the signal's band-pass filtering SNR is
$\gg 1$.

In the remainder of the paper, we consider the
situation where supercomputer simulations are able to
provide some input to data analysis, either in the form of
useful qualitative or semi-quantitative information about
the merger, or in the form of templates. Sec. VI presents
a list of the kinds of information that numerical
relativists may be able to provide, short of a definitive
template set, that can be used to aid data analysis. Sec.
VII discusses and describes the kinds of information that
can be obtained from the gravitational wave data when
merger templates are available.

In Secs. VIII and IX we present our derivations of
criteria for determining whether templates are numerically
accurate enough and closely spaced enough to be used
in data analysis. In Sec. VIII B we derive an accuracy
criterion from the requirement that the loss in event
detection rate due to template inaccuracies in a matched
filtering signal search using merger templates be no more
than 3%. We also obtain, in Sec. VIII B, approximately
the same criterion from demanding that systematic errors
in parameter extraction using merger waveforms be small
compared to the detector-noise induced statistical errors.
In Sec. IX C we rederive the accuracy criterion using the
mathematical machinery of information theory. In this
derivation, we require that the number of bits of
information lost due to template inaccuracies be less than 1.
The relevant information theoretic concepts are presented in
Secs. IX A and IX B; some of the technical calculations
are relegated to Appendix B.

Finally, in Sec. X we summarize our main conclusions.



II. NOTATIONS AND CONVENTIONS 

In this section we introduce some of the conventions 
and notations that will be used throughout the paper. 
We use geometrized units in which Newton's gravita- 
tional constant G and the speed of light c are unity. For 
any function of time a(t), we will use a tilde to repre- 
sent that function's Fourier transform, according to the 
convention 



\[ \tilde a(f) = \int_{-\infty}^{\infty} dt \, e^{2\pi i f t} \, a(t). \qquad (2.1) \]



The output strain amplitude s(t) of a gravitational wave 
detector can be written as 



\[ s(t) = h(t) + n(t), \qquad (2.2) \]



where h(t) is the gravitational wave signal and n(t) is the 
detector noise. Throughout this paper we will assume, 
for simplicity, that the noise is stationary and Gaussian. 
The statistical properties of the noise determine a natural 
inner product (. . . | . . .) on the vector space of waveforms 
h(t), given by 



\[ (h_1 | h_2) = 4 \, {\rm Re} \int_0^\infty df \, \frac{\tilde h_1(f)^* \, \tilde h_2(f)}{S_h(f)}; \qquad (2.3) \]



see, for example, Refs. [26,27]. In Eq. (2.3), $S_h(f)$ is the
power spectral density of the strain noise n(t) [28]. The
associated norm is given by 



\[ \| h \| = \sqrt{(h | h)}. \qquad (2.4) \]



For any waveform h(t), the matched filtering signal-to- 
noise ratio is given by 



\[ \rho^2 = (h | h) = 4 \int_0^\infty df \, \frac{|\tilde h(f)|^2}{S_h(f)}. \qquad (2.5) \]
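As a rough numerical illustration of Eqs. (2.3) and (2.5), the following Python sketch replaces the frequency integral by a simple sum over positive frequencies. The flat noise spectrum, the flat signal spectrum and the function names are assumptions made purely for illustration; they do not represent any real detector or waveform.

```python
import numpy as np

def inner_product(h1_f, h2_f, psd, df):
    """Discretized Eq. (2.3): 4 Re sum[ conj(h1) h2 / S_h ] df over f > 0."""
    return 4.0 * np.real(np.sum(np.conj(h1_f) * h2_f / psd)) * df

def matched_filter_snr(h_f, psd, df):
    """Square root of Eq. (2.5), i.e. rho = sqrt((h|h))."""
    return np.sqrt(inner_product(h_f, h_f, psd, df))

# Toy inputs (assumed values, not taken from the paper).
freqs = np.linspace(10.0, 110.0, 1001)      # positive frequencies in Hz
df = freqs[1] - freqs[0]                    # frequency resolution in Hz
psd = np.full_like(freqs, 1e-46)            # flat S_h(f) in 1/Hz
h_f = np.full(freqs.shape, 1e-24 + 0.0j)    # flat Fourier amplitude in 1/Hz

rho = matched_filter_snr(h_f, psd, df)
```

For a real detector one would substitute a measured or modeled $S_h(f)$ and the Fourier transform of the waveform of interest.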



On several occasions we shall be interested in finite 
stretches of data of length T say, represented in a discrete 
way as a vector of numbers instead of as a continuous 
function. If $\Delta t$ is the sampling time, this vector is

\[ \mathbf{s} = (s^1, \ldots, s^{\mathcal{N}_{\rm bins}}), \qquad (2.6) \]

where $\mathcal{N}_{\rm bins} = T/\Delta t$, $s^j = s(t_{\rm start} + j \Delta t)$, $1 \le j \le \mathcal{N}_{\rm bins}$,
and $t_{\rm start}$ is the starting time. The quantity $\mathcal{N}_{\rm bins}$ is the
number of independent real data points (number of bins)
in the measured signal; it is denoted by $\mathcal{N}$ in Appendices
A and C. The gravitational wave signal h(t) and the
noise n(t) can similarly be represented in this way, so
that $\mathbf{s} = \mathbf{h} + \mathbf{n}$, as in Eq. (2.2). We adopt the geometrical
viewpoint of Dhurandhar and Schutz [29], regarding $\mathbf{s}$ as
an element of an abstract vector space V of dimension
$\mathcal{N}_{\rm bins}$, and the sample points $s^j$ as the components of $\mathbf{s}$
on a time domain basis $\{\mathbf{e}_1, \ldots, \mathbf{e}_{\mathcal{N}_{\rm bins}}\}$ of V:

\[ \mathbf{s} = \sum_{j=1}^{\mathcal{N}_{\rm bins}} s^j \, \mathbf{e}_j. \qquad (2.7) \]



Taking a finite Fourier transform of the data stream can
be regarded as a change of basis of V in which $\mathbf{s}$ remains
fixed but its components change. Thus, a frequency domain
basis $\{\mathbf{d}_k\}$ of V is given by the finite Fourier transform

\[ \mathbf{d}_k = \sum_{j=1}^{\mathcal{N}_{\rm bins}} \mathbf{e}_j \exp\{2\pi i j k / \mathcal{N}_{\rm bins}\}, \qquad (2.8) \]

where $-(\mathcal{N}_{\rm bins} - 1)/2 \le k \le (\mathcal{N}_{\rm bins} - 1)/2$. The
corresponding frequencies $f_k = k/T$ run from $-1/(2\Delta t)$ to
$1/(2\Delta t)$ |§.

More generally, if we band-pass filter the data stream
down to a frequency interval of length $\Delta f$, and consider
a stretch of band-pass filtered data of duration T, this
stretch of data will have

\[ \mathcal{N}_{\rm bins} = 2 T \Delta f \qquad (2.9) \]

independent real data points. In this case also we regard
the set of all such stretches of data as an abstract linear
space V of dimension $\mathcal{N}_{\rm bins}$.
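The bookkeeping of Eqs. (2.6), (2.8) and (2.9) can be checked with a short Python sketch. The sampling time and number of samples below are assumed values; `n_bins` is a hypothetical helper name. An odd number of samples is used so that the index k runs symmetrically, as in the text.

```python
import numpy as np

def n_bins(duration, bandwidth):
    """Eq. (2.9): number of independent real data points, 2 T (Delta f)."""
    return 2.0 * duration * bandwidth

dt = 1.0 / 1024.0          # assumed sampling time in seconds
n = 1025                   # assumed (odd) number of samples
T = n * dt                 # duration of the stretch of data

# Frequencies f_k = k / T of the finite Fourier transform, sorted ascending.
freqs = np.fft.fftshift(np.fft.fftfreq(n, d=dt))

# With the full band Delta f = 1/(2 dt), Eq. (2.9) reproduces T / dt of Eq. (2.6).
full_band_bins = n_bins(T, 1.0 / (2.0 * dt))
```

Band-pass filtering to a narrower band $\Delta f$ simply lowers the count returned by `n_bins` in proportion.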

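On an arbitrary basis of V, we define the matrices
$\Gamma_{ij}$ and $\Sigma^{ij}$ by

\[ \langle n^i n^j \rangle = \Sigma^{ij} \qquad (2.10) \]

and

\[ \Gamma_{ij} \Sigma^{jk} = \delta_i^k, \qquad (2.11) \]

i.e., the matrices $\Gamma$ and $\Sigma$ are inverses of each other. In
Eq. (2.10) the angle brackets mean expected value. On
the time domain basis $\{\mathbf{e}_1, \ldots, \mathbf{e}_{\mathcal{N}_{\rm bins}}\}$, we have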



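\[ \Sigma^{jk} = C_n(t_j - t_k), \qquad (2.12) \]

where $t_j = t_{\rm start} + j \Delta t$, and $C_n(\tau) = \langle n(t) n(t + \tau) \rangle$ is the
noise correlation function given by

\[ C_n(\tau) = \int_0^\infty df \, \cos[2\pi f \tau] \, S_h(f). \qquad (2.13) \]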



We define an inner product on the space V by

\[ (\mathbf{h}_1 | \mathbf{h}_2) = \Gamma_{ij} h_1^i h_2^j. \qquad (2.14) \]

This is essentially a discrete version of the inner product
(2.3) which characterizes the detector noise: the two inner
products coincide in the limit of small sampling times
$\Delta t$, and for waveforms which vanish outside of the time
interval of length T ||. 

Throughout this paper we shall use interchangeably
the notations h(t) and $\mathbf{h}$ for a gravitational waveform.
We shall also for the most part not need to distinguish
between the inner products (2.3) and (2.14). Some generalizations
of these notations and definitions to a network
of several detectors are used in Appendix A.

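For a given detector output $\mathbf{s} = \mathbf{h} + \mathbf{n}$, we define

\[ \rho(\mathbf{s})^2 = (\mathbf{s} | \mathbf{s}), \qquad (2.15) \]

which is the inner product or integral of the detector
output with itself. We will call $\rho(\mathbf{s})$ the magnitude of the
stretch of data $\mathbf{s}$. From Eqs. (2.10) and (2.14) it follows
that

\[ \langle \rho(\mathbf{s})^2 \rangle = \rho^2 + \mathcal{N}_{\rm bins}, \qquad (2.16) \]

where $\rho^2$ is the matched filtering SNR squared (2.5) of
the signal $\mathbf{h}$, and that

\[ \sqrt{\langle [\Delta \rho(\mathbf{s})^2]^2 \rangle} = \sqrt{4 \rho^2 + 2 \mathcal{N}_{\rm bins}}, \qquad (2.17) \]

where $\Delta \rho(\mathbf{s})^2 = \rho(\mathbf{s})^2 - \langle \rho(\mathbf{s})^2 \rangle$. Thus, the magnitude
$\rho(\mathbf{s})$ is approximately the same as the usual SNR $\rho$ in the
limit $\rho \gg \sqrt{\mathcal{N}_{\rm bins}}$ (large signal-to-noise squared per frequency
bin), but is much larger than $\rho$ when $\rho \ll \sqrt{\mathcal{N}_{\rm bins}}$.
The quantity $\rho(\mathbf{s})$ will occur in our information
theory calculations in Sec. IX and Appendix B.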
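The mean and spread of the magnitude squared, read here as $\langle \rho(\mathbf{s})^2 \rangle = \rho^2 + \mathcal{N}_{\rm bins}$ and $\sqrt{4\rho^2 + 2\mathcal{N}_{\rm bins}}$, can be checked with a small Monte Carlo sketch. It assumes a basis in which $\Gamma_{ij} = \delta_{ij}$ (so that the noise components are independent with unit variance); the signal, the number of bins and the number of trials are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

n_bins = 50                    # assumed number of independent data points
rho = 3.0                      # assumed matched filtering SNR of the signal
n_trials = 100_000             # number of noise realizations

# In a whitened basis (Gamma_ij = delta_ij) the inner product is a dot product.
h = np.zeros(n_bins)
h[0] = rho                     # any vector with (h|h) = rho^2 would do

noise = rng.standard_normal((n_trials, n_bins))
s = h + noise                  # detector output s = h + n for each trial
mag2 = np.sum(s**2, axis=1)    # rho(s)^2 = (s|s)

mean_mag2 = mag2.mean()        # compare with rho^2 + n_bins
std_mag2 = mag2.std()          # compare with sqrt(4 rho^2 + 2 n_bins)
```

With these choices the sample mean and standard deviation should come out close to 59 and to $\sqrt{136} \approx 11.7$, up to Monte Carlo fluctuations.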



The space V equipped with the inner product (2.14)
forms a Euclidean vector space. We will also be concerned
with sets of gravitational waveforms $\mathbf{h}(\theta)$ [equivalently,
$h(t; \theta)$] that depend on a finite number $n_p$ of parameters
$\theta = (\theta^1, \ldots, \theta^{n_p})$. For example, inspiral gravitational
waveforms form a set of this type, where $\theta^\alpha$ are
the parameters describing the binary source. We will
denote by S the manifold of signals $\mathbf{h}(\theta)$, which is a
submanifold of dimension $n_p$ of the vector space V. We will
adopt the convention that Roman indices i, j, k, ... will
run from 1 to $\mathcal{N}_{\rm bins}$, and that $v^i$ will denote some vector
in the space V. Greek indices $\alpha$, $\beta$, $\gamma$ will run from 1
to $n_p$, and a vector $v^\alpha$ will denote a vector field on the
manifold S. The inner product (2.14) induces a natural
Riemannian metric on the manifold S given by

\[ ds^2 = \left( \frac{\partial \mathbf{h}}{\partial \theta^\alpha} \, \Big| \, \frac{\partial \mathbf{h}}{\partial \theta^\beta} \right) d\theta^\alpha \, d\theta^\beta. \qquad (2.18) \]

We shall denote this metric by $\Gamma_{\alpha\beta}$ and its inverse by
$\Sigma^{\alpha\beta}$, relying on the index alphabet to distinguish these
quantities from the quantities (2.10) and (2.11). For



more details on this geometric picture, see, for example, 
Ref. @. 

We shall use the word detector to refer to either a sin- 
gle interferometer or a resonant mass antenna, and the 
phrase detector network to refer to a collection of de- 
tectors operated in tandem. Note that this terminology 
differs from that adopted in, for example, Ref. p6| , where 
a detector network is called simply a detector.
Finally, we will use bold faced vectors like $\mathbf{a}$ to denote
either vectors in three dimensional space, or vectors in
the $\mathcal{N}_{\rm bins}$-dimensional space V. In Appendix A, we will
use arrowed vectors (e.g., $\vec{a}$) to denote elements of the
linear space of the output of a network of gravitational
wave detectors.



III. INFORMATION FROM THE INSPIRAL AND 
RINGDOWN PHASES 

Different types of information will be obtainable from 
the three different phases of the gravitational wave signal. 
If the inspiral and ringdown phases are strong enough 
to be measurable, they will be easier to analyze than 
the merger phase, and the information they yield will be 
used as "prior information" in attempting to analyze the 
merger phase. For instance, from the inspiral portion of 
the signal it will be possible to measure the masses of 
the binary's black holes to some accuracy (as we discuss 
below). Those measured masses will then be an input 
to data analysis of the merger waves, since they strongly 
constrain the possible values of template parameters that 
need to be examined when fitting a theoretical waveform 
to the merger signal. In this section, we review the prior 
information that will likely be available from measure- 
ments of the inspiral and the ringdown in typical cases. 

Let us focus first on solar mass coalescences
[$(1 + z)M < 50 M_\odot$, say] measured by ground based interferometers,
for which most of the prior information will come
from the inspiral waveforms. The analysis of the inspi- 
ral waveforms will take place in two phases. The first 
phase will consist of filtering the data streams of each 
detector separately using "search templates" in order to 
detect the inspiral (3^]. These search templates will depend
on 2 or possibly 3 parameters. Roughly $10^4$ to $10^5$
distinct template shapes will be required for initial LIGO
interferometers, and roughly $10^6$ to $10^7$ template shapes
for advanced LIGO interferometers [[^-[H]. (Note that 
these numbers assume that the search is for generic in- 
spiraling binaries, not simply black hole binaries. If the 
search were restricted to BBH systems only, these numbers
would be greatly reduced: assuming that the smallest
BBH systems consist of a pair of $2 M_\odot$ black holes, the
number of templates for initial LIGO interferometers is
roughly $10^3$, and for advanced interferometers roughly
$10^5$.) The second phase will consist of combining the
outputs of all the detectors together and using the most 
accurate templates available ( "extraction templates" ) to 
analyze the signal and extract the best-fit parameter val- 
ues. Such extraction templates will presumably be pro- 
vided by post-Newtonian calculations, perhaps improved 
by the judicious use of Padé approximants [Q, and perhaps
supplemented by IBBH calculations in the IBBH
regime $6M < r < 12M$ (cf. the discussion in Sec. I A
above). In this second phase there will be 15 independent
parameters to fit for. These parameters are the
masses $m_1$ and $m_2$ and initial spins $\mathbf{S}_1$ and $\mathbf{S}_2$ of the two
black holes, the luminosity distance D to the binary, the
direction of the orbital angular momentum $\hat{\mathbf{L}} = \mathbf{L}/|\mathbf{L}|$,
the direction $\mathbf{n}$ from the binary to the Earth, and the
arrival time $t_c$ and orbital phase $\varphi_c$ at some fiducial frequency.
(The dependence of the templates on several of
these 15 parameters, such as the luminosity distance, will
be trivial and will not need to be computed numerically.)

As an example, consider a binary with two non-spinning
$10 M_\odot$ black holes at a distance of 200 Mpc.
The inspiral SNR for this system is $\sim 100$ for advanced
LIGO interferometers [11]. In this optimistic case, the
information obtained from the inspiral waveform will be
roughly as follows p(i| j: The distance to the system will
be known to < 2%, the masses will be known to $\sim 40\%$
(although the chirp mass $\mathcal{M} = \mu^{3/5} M^{2/5}$ will likely be
known to an accuracy of < 0.1%), the arrival time to
$\sim 0.1$ ms, the position on the sky to less than one square
degree, and the angles defining $\hat{\mathbf{L}}$ and $\varphi_c$ to < 10°. Also,
some information will be obtained about two particular
combinations of the spins $\mathbf{S}_1$ and $\mathbf{S}_2$ (see Refs.
for details). As a second example, consider a binary of
two $15 M_\odot$ black holes at z = 1, for which the inspiral
SNR for advanced interferometers is $\sim 7$ [11]. For such a
binary the accuracies are several times worse. The luminosity
distance is measured to $\sim 20\%$, for example, and
although the chirp mass is measured to < 1%, the individual
masses are only constrained to lie in the ranges
$3 M_\odot < m_1 < 15 M_\odot$ and $15 M_\odot < m_2 < 100 M_\odot$ g(J.



Turn, now, to the information obtainable from inspiral
signals for the space-based LISA interferometer. Equation
(A6) of Ref. [11] shows that the time $T_{\rm insp}$ which the
gravitational wave signal spends in the interferometer's
bandwidth during the inspiral before merger is approximately

\[ T_{\rm insp} \approx 0.4 \, {\rm yr} \left[ \frac{(1+z)M}{10^6 M_\odot} \right]^{-5/3} \left\{ 1 - \left[ \frac{(1+z)M}{4 \times 10^7 M_\odot} \right]^{8/3} \right\}. \qquad (3.1) \]
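To get a feel for the numbers, Eq. (3.1) can be evaluated directly. The sketch below reads the equation as $T_{\rm insp} \approx 0.4\,{\rm yr}\,[(1+z)M/10^6 M_\odot]^{-5/3}\{1 - [(1+z)M/(4 \times 10^7 M_\odot)]^{8/3}\}$; the function name is hypothetical, and returning zero at and above $4 \times 10^7 M_\odot$ is an added convention, not something stated in the text.

```python
def lisa_inspiral_time_yr(redshifted_mass_msun):
    """Approximate time (in years) spent in the LISA band before merger.

    redshifted_mass_msun is (1 + z) M in solar masses.
    """
    scaled = redshifted_mass_msun / 1.0e6
    cutoff = redshifted_mass_msun / 4.0e7
    if cutoff >= 1.0:
        # Assumed convention: no in-band inspiral once the bracket vanishes.
        return 0.0
    return 0.4 * scaled ** (-5.0 / 3.0) * (1.0 - cutoff ** (8.0 / 3.0))

t_1e5 = lisa_inspiral_time_yr(1.0e5)
t_1e6 = lisa_inspiral_time_yr(1.0e6)
t_4e7 = lisa_inspiral_time_yr(4.0e7)
```

Lighter systems spend longer in band, so for them the relevant SNR is that of the last year of inspiral, as noted in the text that follows.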



Signal-to-noise ratios from such inspirals (or from the
last year of inspiral if $T_{\rm insp} > 1$ yr) will be > 100 for
all events with cosmological redshift z < 10 and with
$10^4 M_\odot < (1 + z)M < 5 \times 10^7 M_\odot$; see Fig. 6 of Ref.
[11]. Thus, detailed information about the binary's parameters
should be available for analyzing merger signals
detected by LISA @. (For some LISA BBH sources, 
most of the inspiral SNR will come from the IBBH regime 
6M < r < 12M discussed in the Introduction. For such 
sources, accurate IBBH templates will likely be needed 
to extract all the available inspiral information.) 

In some cases with LIGO/VIRGO, and in many cases
with LISA, it will also be possible to analyze the ringdown
waveform using optimal filtering to extract the
ringdown frequency and damping time [31,43]. These
measurements will yield the mass M and spin parameter
a of the final black hole. The accuracy of such measurements
will be approximately given by [31,43]

\[ \frac{\Delta M}{M} \approx \frac{2 (1 - a)^{9/20}}{(S/N)_{\rm ringdown}}, \qquad \Delta a \approx \frac{6 (1 - a)^{1.06}}{(S/N)_{\rm ringdown}}, \qquad (3.2) \]



where $(S/N)_{\rm ringdown}$ is the matched filtering SNR for the
ringdown signal. It should also be possible to measure the
time at which the ringdown starts to within an accuracy
$\sim 1/f_{\rm qnr}$. For low mass coalescences ($M < 50 M_\odot$), such
measurements will only be possible for the very strongest
detected events: the ringdown SNR will be > 1 only for
the strongest $\sim 1\%$ of detected events for initial and advanced
LIGO interferometers [Q . For larger mass BBH
coalescences, however, the ringdown SNR will be larger,
as can be seen from Figs. 4 and 5 of Ref. [11], and ringdown
measurements will be feasible for a reasonable fraction
of detected signals. For LISA, Fig. 6 of Ref. [11]
shows that most detected merger events will be accompanied
by easily detectable ringdown signals with SNR
values > 100. Thus, accurate values of M and a should
be available as prior information when analyzing merger
signals detected by LISA.
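The scalings in Eq. (3.2) are easy to tabulate. The sketch below assumes the reading $\Delta M/M \approx 2(1-a)^{9/20}/(S/N)_{\rm ringdown}$ and $\Delta a \approx 6(1-a)^{1.06}/(S/N)_{\rm ringdown}$; the function name and the sample inputs are illustrative only.

```python
def ringdown_errors(snr, spin):
    """Approximate measurement errors from the ringdown, following Eq. (3.2).

    snr  : matched filtering SNR of the ringdown signal
    spin : dimensionless spin parameter a of the final black hole (0 <= a < 1)
    Returns (fractional mass error, absolute spin error).
    """
    frac_mass_error = 2.0 * (1.0 - spin) ** (9.0 / 20.0) / snr
    spin_error = 6.0 * (1.0 - spin) ** 1.06 / snr
    return frac_mass_error, spin_error

# Example: a ringdown with SNR 100, as quoted for typical LISA events.
dm_over_m, da = ringdown_errors(100.0, 0.0)
dm_fast, da_fast = ringdown_errors(100.0, 0.9)
```

Both errors shrink as the spin approaches unity and scale inversely with the ringdown SNR.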

For the strongest detected signals, it may also be possible
to measure the complex amplitudes of some of the
quasinormal modes in the waveform other than the dominant
l = m = 2 mode. These higher order quasinormal
ringing (QNR) modes will not be as long lived as
the l = m = 2 mode, but they may nevertheless be detectable.
The amplitudes and phases of such modes will
constitute very useful information if they are measurable, 
since their values should be predicted by the supercom- 
puter simulations as functions of the binary's parameters 
at the start of the merger phase. The supercomputer 
simulations will have passed an important test if the mea- 
sured mode amplitude values are consistent with known 
information about the initial conditions. 



IV. ANALYSIS OF THE MERGER WAVES
WITHOUT TEMPLATES— VISIBILITY OF
MERGER SIGNAL AFTER BAND-PASS
FILTERING

Turn now to the data analysis of the merger waves,
focusing on the case in which matched filtering cannot
be used. This situation will arise if supercomputer simulations
are unable to produce merger templates, or if
they have only produced a small sampling of the total
function space S of merger waveforms when BBH signals
are detected. Such a sampling should provide valuable
qualitative information about the merger waveforms (as
we discuss in Sec. VI below), but would be too sparse to
be used as a bank of optimal filters. (As mentioned in
Sec. I C above, it may be possible to perform matched
filtering in the absence of a complete set of templates,
but this is not guaranteed.)

In the absence of a complete set of theoretical templates,
one's first aim will be to reconstruct from the
noisy detector output a best-guess estimate of the merger
waveform h(t) p5 |. If a small number of representative
supercomputer templates are available, it may then be
possible to interpret the measured waveform and obtain
qualitative information about the BBH source. One very
simple procedure that could be used to obtain an estimate
of the waveform shape is simply to band-pass filter
the data stream according to our prior prejudice about
the frequency band of the merger waves (based on estimates
of the merger signal bandwidth [pi] , hopefully
supplemented by information from representative supercomputer
simulations and from inspiral/ringdown measurements) [46].
However, after such band-pass filtering,
the merger signal may be dominated by detector noise
and may not even be visible. (Signals that are visible in
the noise will clearly be easier to reconstruct from the
noisy data stream; we demonstrate this mathematically
in Sec. V D below.)

In this section we explore this issue of merger waveform
visibility, by which we mean whether or not the signal
stands out above the noise after band-pass filtering. A
signal will be visible if the band-pass filtering SNR is
large compared to unity; see, for example, the discussion
in Ref. H . We use the results of Ref. Q to estimate
band-pass filtering SNRs, first for the inspiral waves near
the end of the inspiral in Sec. IV A, and then for the
merger waves in Sec. IV B. The analysis of the inspiral
waves is useful as background for the merger visibility
calculation, and is also indicative of the visibility of the
early merger waves (if the endpoint of inspiral is visible
with band-pass filters, then one would expect that by
continuity the beginning of the merger should be visible
as well).

A. Visibility of inspiral waveform

We focus on BBH events which have been detected via 
their inspiral waves using matched filtering. Since the 
event has been detected, the inspiral matched filtering 
SNR must be > 6 [Q; however, it does not follow that 
the inspiral signal is visible in the data stream without 
matched filtering. (In fact, for neutron star-neutron star 
binaries the reverse is usually the case: the amplitude of 
the signal is rather less than the noise, and so matched 
filtering is very necessary to detect the waves.) We now 
estimate the degree of visibility of the last few cycles of 
the inspiral waveform for BBH coalescences. 

The dominant harmonic of the inspiral waveform can 
be written as 



\[ h(t) = h_{\rm amp}(t) \cos[\Phi(t)], \qquad (4.1) \]

where the amplitude $h_{\rm amp}(t)$ and instantaneous frequency
f(t) [given by $2\pi f(t) = d\Phi/dt$] are slowly evolving.
For such waveforms, the SNR squared obtained using
band-pass filtering is approximately given by the SNR







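squared per cycle obtained from matched filtering [cf. Eq.
(2.9) of Ref. 0]:

\[ \left( \frac{S}{N} \right)^2_{\rm band\text{-}pass} \approx \left( \frac{S}{N} \right)^2_{\rm optimal\ filter,\ per\ cycle} = \frac{1}{2} \left[ \frac{h_{\rm amp}[t(f)]}{h_n(f)} \right]^2. \qquad (4.2) \]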



In Eq. (4.2), an rms average over source orientations
has been performed, t(f) denotes the time at which
the instantaneous frequency has value f, and $h_n(f) =
\sqrt{5 f S_h(f)}$. Note that the band-pass filtering SNR (4.2)
is evaluated at a specific frequency, whereas typically
when one discusses matched filtering SNRs, an integral
over a large frequency band has been performed. Next,
we insert the value of $h_{\rm amp}[t(f)]^2$ for the leading-order
approximation to the inspiral waves, which can be obtained
from, for example, Eq. (3.20) of Ref. pH, and
obtain

\[ \left( \frac{S}{N} \right)^2_{\rm band\text{-}pass} = \frac{64 \pi^{4/3} \mathcal{M}^{10/3} (1+z)^{10/3} f^{4/3}}{5 D(z)^2 h_n(f)^2}. \qquad (4.3) \]



Here $\mathcal{M} = \mu^{3/5} M^{2/5}$ is the chirp mass, z is the binary's
cosmological redshift and D(z) is the binary's luminosity
distance.

In Eq. (4.1) of Ref. [11] we introduced an analytic formula
for a detector's noise spectrum $S_h(f)$, which, by
specialization of its parameters, could describe to a good
approximation either an initial LIGO interferometer, an
advanced LIGO interferometer, or a space-based LISA interferometer.
We now insert that formula into Eq. (4.2),
and specialize to the frequency

\[ f = f_{\rm merge} = \frac{\gamma_m}{(1+z)M}, \qquad (4.4) \]



where $\gamma_m = 0.02$. The frequency $f_{\rm merge}$ is approximately
the location of the transition from inspiral to merger, as
estimated in Ref. [11]. We thus obtain for the band-pass
filtering SNR

\[ \left( \frac{S}{N} \right)^2_{\rm band\text{-}pass} = \frac{4 \pi^{4/3} M^5 (1+z)^5 \gamma_m^{-5/3} \alpha^3 f_m^3}{5 D(z)^2 h_m^2}, \qquad (4.5) \]

where $\alpha$, $h_m$ and $f_m$ are the parameters used in Ref.
[11] to describe the interferometer noise curve. Equation
(4.5) is valid only when the redshifted mass (1 + z)M of
the binary is smaller than $\gamma_m/(\alpha f_m)$.

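For initial LIGO interferometers, appropriate values of
the parameters $h_m$, $f_m$ and $\alpha$ are given in Eq. (4.2) of
Ref. [11]. Inserting these values into Eq. (4.5) gives

\[ \left( \frac{S}{N} \right)_{\rm band\text{-}pass} \approx 1.1 \left[ \frac{200 \, {\rm Mpc}}{D(z)} \right] \left[ \frac{(1+z)M}{20 M_\odot} \right]^{5/2}, \qquad (4.6) \]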



which is valid for $(1 + z)M < 18 M_\odot$. Now, the SNR
obtained by matched filtering the inspiral signal (i.e., by
correlating the inspiral data with an inspiral template
over the full bandwidth of the signal) is approximately

\[ \left( \frac{S}{N} \right)_{\rm optimal} \approx 2.6 \left[ \frac{200 \, {\rm Mpc}}{D(z)} \right] \left[ \frac{(1+z)M}{20 M_\odot} \right]^{5/6}. \qquad (4.7) \]



Also the quantity (4.7) must be > 6 [34], because, by assumption,
the inspiral has in fact been detected. By eliminating
the luminosity distance D(z) between Eqs. (4.6)
and (4.7) we find that the band-pass filtering SNR for
the last inspiral cycles of detected binaries satisfies

\[ \left( \frac{S}{N} \right)_{\rm band\text{-}pass} > 2.5 \left[ \frac{(1+z)M}{20 M_\odot} \right]^{5/3}. \qquad (4.8) \]



Therefore, the last few cycles of the inspiral should be
individually visible above the noise for BBH events with
$5 M_\odot < M < 20 M_\odot$ detected by initial LIGO interferometers.
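The elimination of D(z) that leads from Eqs. (4.6) and (4.7) to Eq. (4.8) is simple enough to spell out in code. The sketch below takes the numerical prefactors quoted in the text as inputs (1.1 and 2.6 for initial LIGO interferometers; the same function applies to any other pair of prefactors with the same mass scalings). The function name and argument names are hypothetical.

```python
def bandpass_snr_lower_bound(redshifted_mass_msun, c_bandpass, c_optimal,
                             threshold=6.0, ref_mass_msun=20.0):
    """Lower bound on the band-pass SNR of the last inspiral cycles.

    Assumes the scalings of the text:
      band-pass SNR = c_bandpass * (D_ref / D) * x**(5/2),
      optimal SNR   = c_optimal  * (D_ref / D) * x**(5/6),
    with x = (1 + z) M / ref_mass. Requiring optimal SNR > threshold and
    eliminating D_ref / D gives a bound proportional to x**(5/3).
    """
    x = redshifted_mass_msun / ref_mass_msun
    return (c_bandpass / c_optimal) * threshold * x ** (5.0 / 2.0 - 5.0 / 6.0)

# Initial LIGO prefactors from Eqs. (4.6) and (4.7).
bound_20 = bandpass_snr_lower_bound(20.0, 1.1, 2.6)
bound_10 = bandpass_snr_lower_bound(10.0, 1.1, 2.6)
```

At the reference mass the bound is $6 \times 1.1/2.6 \approx 2.5$, which is the coefficient in Eq. (4.8).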

We now repeat the above calculation with the values
of $h_m$, $f_m$, and $\alpha$ appropriate for advanced LIGO interferometers,
which are given in Eq. (4.3) of Ref. [11]. The
band-pass filtering SNR for advanced interferometers is

\[ \left( \frac{S}{N} \right)_{\rm band\text{-}pass} \approx 1.6 \left[ \frac{1 \, {\rm Gpc}}{D(z)} \right] \left[ \frac{(1+z)M}{20 M_\odot} \right]^{5/2}, \qquad (4.9) \]



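and the SNR obtained by matched filtering the inspiral
signal is

\[ \left( \frac{S}{N} \right)_{\rm optimal} \approx 16 \left[ \frac{1 \, {\rm Gpc}}{D(z)} \right] \left[ \frac{(1+z)M}{20 M_\odot} \right]^{5/6}, \qquad (4.10) \]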



for $(1 + z)M < 37 M_\odot$. So, with the assumption that
$(S/N)_{\rm optimal} \gtrsim 6$, we find

\[ \left( \frac{S}{N} \right)_{\rm band\text{-}pass} > 0.6 \left[ \frac{(1+z)M}{20 M_\odot} \right]^{5/3} \qquad (4.11) \]

for $(1 + z)M < 37 M_\odot$. Therefore, for BBH inspirals with
$(1 + z)M < 37 M_\odot$ detected by advanced LIGO interferometers,
the last few cycles of the inspiral will be just
barely individually visible above the noise, depending on
the binary's total mass M. The last few cycles of the
inspiral will also be visible for larger mass BBH systems,
as can be seen by combining Eq. (4.2) above with Figs. 4
and 5 of Ref. [11].

For LISA, Eq. flOj) combined with Eq. (4.3) of Ref. 
yields 



\[ \left( \frac{S}{N} \right)_{\rm band\text{-}pass} \approx 180 \left[ \frac{1 \, {\rm Gpc}}{D(z)} \right] \left[ \frac{(1+z)M}{10^6 M_\odot} \right]^{5/2} \qquad (4.12) \]

for $(1 + z)M < 10^5 M_\odot$, with larger values for $10^5 M_\odot <
(1 + z)M < 3 \times 10^7 M_\odot$. Therefore individual cycles of
the inspiral waveform should be clearly visible for LISA.



B. Visibility of merger waveform

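Consider now the merger waveform itself. This will be
visible if the SNR from band-pass filtering of the merger
signal is large compared to unity. In Ref. [11] we showed
that

\[ \left( \frac{S}{N} \right)_{\rm band\text{-}pass,\ merger} \approx \frac{1}{\sqrt{\mathcal{N}_{\rm bins}}} \left( \frac{S}{N} \right)_{\rm optimal,\ merger}, \qquad (4.13) \]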



where $\mathcal{N}_{\rm bins} = 2 T \Delta f$; T and $\Delta f$ are the expected duration
and bandwidth of the merger signal. We also estimated
[Eq. (3.32) of Ref. [11]] that for the merger waves,

\[ \sqrt{\mathcal{N}_{\rm bins}} \sim 5, \qquad (4.14) \]

although there is large uncertainty in this estimate and
$\mathcal{N}_{\rm bins}$ will vary from event to event. Combining Eq. (5.4)
of Ref. [11] for initial LIGO interferometers, Eq. (4.14),
and the threshold for detection [34]

\[ \left( \frac{S}{N} \right)_{\rm optimal,\ inspiral} > 6 \qquad (4.15) \]

yields

\[ \left( \frac{S}{N} \right)_{\rm band\text{-}pass,\ merger} > 0.8 \left[ \frac{(1+z)M}{20 M_\odot} \right]^{5/3} \qquad (4.16) \]



for $(1 + z)M < 18 M_\odot$. For advanced LIGO interferometers,
Eq. (4.14) together with Eq. (5.5) of Ref. [11]
similarly yield

\[ \left( \frac{S}{N} \right)_{\rm band\text{-}pass,\ merger} > 0.2 \left[ \frac{(1+z)M}{20 M_\odot} \right]^{5/3} \qquad (4.17) \]



for $(1 + z)M < 37 M_\odot$. Note that, contrary to one's
intuition, the value (4.17) for advanced interferometers
is lower than the value (4.16) for initial interferometers.
This is because the advanced interferometers can detect
inspirals with lower band-pass filtering SNRs than the
initial interferometers, due to the larger number of cycles
of the inspiral signal in the advanced interferometer's
bandwidth. Matched filtering is extremely efficient
at detecting inspiral signals, and it is more so for advanced
interferometers than for initial interferometers.
The weaker the signals that are detectable by matched
filtering, the less visible the merger waveform will be after
band-pass filtering.

The SNR values (4.16) and (4.17) indicate that for
typical inspiral-detected BBH systems with $M < 20 M_\odot$
(initial interferometers) or $M < 40 M_\odot$ (advanced interferometers),
the merger signal will not be easily visible in
the noise, and that only the somewhat rarer, closer events
will have easily visible merger signals. This conclusion is
somewhat tentative because of the uncertainty in the
estimates of $\mathcal{N}_{\rm bins}$ and of the energy spectra discussed in
Ref. [11]. Also the visibility of the merger waveform will
probably vary considerably from event to event.

This conclusion only applies to low mass BBH systems
which are detected via their inspiral waves. For higher
mass systems which are detected directly via their merger
and/or ringdown waves, the merger signal should be visible
above the noise after appropriate band-pass filtering.
Moreover, most merger events detected by LISA will have
band-pass filtering SNRs $\gg 1$, as can be seen from Fig. 6
of Ref. [11], and thus should be easily visible.

Our crude visibility argument thus suggests that the 
prospects for accurately recovering the merger waveform 
are good only for the stronger detected merger signals. 
This visibility analysis also illustrates the importance of 
theoretical template waveforms: the SNRs that can be 
achieved without them will often be mediocre at best. 
Templates for the merger will be able to boost measured
SNRs by a factor $\sqrt{\mathcal{N}_{\rm bins}} \sim 5$. Of course, we need to
go beyond this simple analysis and try to determine the 
optimal method of reconstructing the shape of the merger 
waveform from the noisy data; we propose one method 
in the following section. 
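For concreteness, the band-pass filtering operation considered throughout this section can be sketched as a simple mask in the frequency domain. This is a minimal illustration under assumed inputs (an idealized brick-wall filter, a one second stretch of data, and two test sinusoids); a practical analysis would use a smoother filter response and real detector data. The function name `band_pass` is hypothetical.

```python
import numpy as np

def band_pass(data, dt, f_low, f_high):
    """Zero all Fourier components outside [f_low, f_high] (idealized filter)."""
    spectrum = np.fft.rfft(data)
    freqs = np.fft.rfftfreq(len(data), d=dt)
    spectrum[(freqs < f_low) | (freqs > f_high)] = 0.0
    return np.fft.irfft(spectrum, n=len(data))

# Toy data: one sinusoid inside the assumed band and one outside it.
dt = 1.0 / 1024.0
t = np.arange(1024) * dt
in_band = np.sin(2.0 * np.pi * 50.0 * t)
out_of_band = np.sin(2.0 * np.pi * 300.0 * t)

filtered = band_pass(in_band + out_of_band, dt, f_low=20.0, f_high=100.0)
```

The filtered stretch of duration T and bandwidth $\Delta f$ retains $\mathcal{N}_{\rm bins} = 2T\Delta f$ independent data points, which is the quantity that controls the loss of SNR relative to matched filtering in Eq. (4.13).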



V. ANALYSIS OF THE MERGER WAVES 
WITHOUT TEMPLATES — A METHOD OF 
EXTRACTING A BEST-GUESS MERGER 
WAVEFORM FROM THE NOISY DATA STREAM 

A. Overview 

In the absence of a complete set of theoretical tem- 
plates we would like to reconstruct from the noisy de- 
tector data stream a best-guess estimate of the merger 
waveform h(t). In this section, we suggest and describe 
a method, based on the technique of maximum likeli- 
hood estimation |^,^] , for performing such a waveform 
reconstruction. 

A method for estimating the merger waveform shape 
h(t) should use all available prior knowledge about the 
waveform. We will hopefully know from representative 
supercomputer simulations and perhaps from the mea- 
sured inspiral/ringdown signals the following: (i) the ap- 
proximate starting time of the merger; (ii) the fact that 
it starts off strongly (smoothly joining on to the inspiral 
waveform) and eventually dies away in quasinormal ring- 
ing; and (iii) the approximate bandwidth and duration 
of the signal. For those signals for which both the inspi- 
ral and the ringdown are strong enough to be detectable 
with optimal filtering, the duration of the merger portion 
of the waveform will be fairly well known, as will the frequency
$f_{\rm qnr}$ of the ringdown signal onto which the merger
waveform must smoothly join. The technique which we
describe in this section encodes such prior information
and makes use of it in reconstructing the best-guess estimate
of the waveform.
We shall describe this method in the context of a single
detector or interferometer. However, in a few years
there will be in operation a network of several detectors
(both interferometers and resonant mass antennae),
and from the combined outputs of these several detectors
one would like to reconstruct the two independent polarization
components $h_+(t)$ and $h_\times(t)$ of the gravitational
waves from the merger. In Appendix A we show how to
extend the waveform estimation method discussed in this
section to an arbitrary number of detectors, which yields
a method of reconstructing the two waveforms $h_+(t)$ and
$h_\times(t)$.

The issue of reconstructing the waveforms $h_+(t)$ and
$h_\times(t)$ was previously addressed by Gürsel and Tinto p2| ,
in the context of a network of three interferometers and
for arbitrary bursts of gravitational waves. Gürsel and
Tinto suggest a method of extracting, from the outputs
of all the interferometers, (i) the direction to the source,
and (ii) the two gravitational waveforms. For many BBH
mergers, the direction to the source will have already
been determined to fairly good accuracy from the inspiral
waveform [E7], and so the Gürsel-Tinto filtering method
is not directly applicable. However, they do suggest in
passing a method for extracting the waveforms $h_+(t)$ and
$h_\times(t)$ when the direction to the source is given. In Appendix A
we show that our filtering method (as extended
to a network of interferometers) is an extension and generalization
of the Gürsel-Tinto algorithm.

The filtering methods which we consider are based 
on the theory of maximum likelihood estimation |^8|,^9| . 
The use of maximum likelihood estimators has been dis- 
cussed extensively by many authors in the context of 
gravitational waves of a known functional form, depend- 
ing only on a few parameters In this 
section we consider their application to gravitational 
wave bursts of largely unknown shape. The resulting 
data analysis methods which we derive are closely re- 
lated mathematically to the methods discussed previ- 
ously [^6 27 3^^,Q i but are considerably different in 
operational terms and in implementation. 



B. Derivation of data analysis method 

We now turn to our derivation of the best-guess wave- 
form estimator using maximum likelihood estimation. 
Suppose that our prior information about the merger 
waves includes the information that they lie inside some 
time interval of duration T, and inside some frequency interval
of length $\Delta f$. We define $\mathcal{N}_{\rm bins} = 2 T \Delta f$, cf. Sec. II
above. We also suppose that we have a stretch of data
to analyze of duration $T' > T$ and with sampling time
$\Delta t < 1/(2 \Delta f)$. These data lie in a linear space V of
dimension

\[ \mathcal{N}'_{\rm bins} = T'/\Delta t, \qquad (5.1) \]

which is strictly larger than $\mathcal{N}_{\rm bins}$. Thus, $\mathcal{N}'_{\rm bins}$ is the
number of independent real data points in the data, and
$\mathcal{N}_{\rm bins}$ is the number of independent real data points in
that subset of the data which we expect to contain the
merger signal. Note that these definitions constitute a
modification/extension of the conventions introduced in
Sec. II above, where the dimension of the space V was
denoted by $\mathcal{N}_{\rm bins}$. We will use, unmodified, the other
conventions of Sec. II: thus, the detector output $\mathbf{s}$ is given
by $\mathbf{s} = \mathbf{h} + \mathbf{n}$, where $\mathbf{h}$ is the gravitational-wave signal and
$\mathbf{n}$ is the detector noise, and the vectors $\mathbf{s}$, $\mathbf{h}$ and $\mathbf{n}$ are all
elements of the vector space V of dimension $\mathcal{N}'_{\rm bins}$.

In our analysis below, we will allow the basis of the vector
space V to be arbitrary. Thus, $n^i$ (for example) will
denote the components of the noise on this arbitrary basis.
However, we will occasionally specialize to the time-domain
and frequency-domain bases discussed in Sec. II
above. We will also consider wavelet bases of V. Wavelet
bases can be regarded as any set of functions $w_{ij}(t)$ such
that $w_{ij}(t)$ is approximately localized in time at the time
$t_i = t_{\rm start} + (i/n_T) T'$, and approximately localized in
frequency at the frequency $f_j = (j/n_F)(\Delta t)^{-1}$. The index i
runs from 1 to $n_T$ and j from $-(n_F - 1)/2$ to $(n_F - 1)/2$.
Clearly the number of frequency bins $n_F$ and the number
of time bins $n_T$ must satisfy $n_T n_F = \mathcal{N}'_{\rm bins}$, but otherwise
they can be arbitrary; typically $n_T \sim n_F \sim \sqrt{\mathcal{N}'_{\rm bins}}$.
Also, the functions $w_{ij}$ usually all have the same shape,
so that

\[ w_{ij}(t) \propto \varphi[f_j (t - t_i)], \qquad (5.2) \]

for some function $\varphi$. For our considerations here, the
shape of $\varphi$ is not of critical importance. Also, wavelet
bases are often overcomplete; the bases we discuss below
are to be considered simply complete. So if the full function
space of some family of wavelets is W, we restrict
ourselves to some complete subset of that space. The
advantage of wavelet bases is that they simultaneously
encode frequency domain and time domain information.

Let $p^{(0)}(\mathbf{h})$ be the probability distribution (PDF) that
summarizes our prior information about the gravitational
waveform. A standard Bayesian analysis shows that the
PDF of $\mathbf{h}$ given the measured data stream $\mathbf{s}$ is [26,51]

\[ p(\mathbf{h} | \mathbf{s}) = \mathcal{K} \, p^{(0)}(\mathbf{h}) \exp\left[ -\Gamma_{ij} (h^i - s^i)(h^j - s^j)/2 \right], \qquad (5.3) \]

where the matrix $\Gamma_{ij}$ is defined in Eq. (2.11) and $\mathcal{K}$ is a
normalization constant [26]. In principle this PDF gives
complete information about the measurement. Maximizing
the PDF will yield the maximum likelihood estimator
for the merger waveform $\mathbf{h}$. This estimator will be some
function $\hat{\mathbf{h}} = \hat{\mathbf{h}}(\mathbf{s})$, which in general will be a non-linear
function. The effectiveness of the resulting estimator of
the waveform will depend on how much prior information
concerning the waveform shape can be encoded in
the choice of prior PDF $p^{(0)}$.

One of the simplest possibilities is to take $p^{(0)}$ to be concentrated on some linear subspace $U$ of the space $V$, and to be approximately constant inside this subspace. A multivariate Gaussian with widths very small in some directions and very broad in others would accomplish this to a good approximation. For such choices of prior PDF $p^{(0)}$, the resulting maximum likelihood estimator [the function $\hat{\mathbf{h}} = \hat{\mathbf{h}}(\mathbf{s})$ that maximizes the PDF (5.3)] is simply the perpendicular projection $P_U$ of $\mathbf{s}$ into $U$:

$$\mathbf{h}_{\rm best-fit}(\mathbf{s}) = P_U(\mathbf{s}), \tag{5.4}$$

where

$$P_U(\mathbf{s}) = \sum_{i,j=1}^{n_U} u^{ij}\,(\mathbf{u}_i|\mathbf{s})\,\mathbf{u}_j. \tag{5.5}$$



Here, $\mathbf{u}_1, \ldots, \mathbf{u}_{n_U}$ is an arbitrary basis of $U$, $n_U$ is the dimension of $U$, $u^{ij} u_{jk} = \delta^i_k$ and $u_{jk} = (\mathbf{u}_j|\mathbf{u}_k)$.

We remark that the method of filtering (5.4) is a special case of Wiener optimal filtering: it is equivalent to optimal filtering with templates that are constructed by taking linear combinations of the basis functions $\mathbf{u}_j$. (The equivalence between maximum likelihood estimation and Wiener optimal filtering in more general contexts has been shown by Echeverria [51].) To show that our filtering method is a form of Wiener optimal filtering, define a family of template waveforms that depends on parameters $a_1, \ldots, a_{n_U}$ by

$$h(t; a_j) = \sum_{j=1}^{n_U} a_j u_j(t), \tag{5.6}$$

where $u_j(t)$ are the functions of time corresponding to the basis elements $\mathbf{u}_j$ of $U$. If $s(t)$ is the measured detector output, define for any function $h(t)$

$$\frac{S}{N}[h(t)] \equiv \frac{(\mathbf{s}|\mathbf{h})}{\sqrt{(\mathbf{h}|\mathbf{h})}}. \tag{5.7}$$

This is the SNR for the template $h(t)$ with the data stream $\mathbf{s}$. The best-fit signal given by the optimal filtering method is the template which maximizes the SNR (5.7), i.e., the template $h(t; \hat{a}_j)$ such that

$$\frac{S}{N}[h(t; \hat{a}_j)] = \max_{a_1, \ldots, a_{n_U}} \frac{S}{N}[h(t; a_j)]. \tag{5.8}$$

However, it is easy to show from Eqs. (5.5)-(5.7) that

$$P_U(\mathbf{s}) = h(t; \hat{a}_j). \tag{5.9}$$



Thus, computing the perpendicular projection (5.5) of $\mathbf{s}$ into $U$ is equivalent to Wiener optimal filtering with the family of templates (5.6). From an operational point of view, the method of filtering (5.5) is quite different from the normal implementation of optimal filtering, which is carried out by calculating the SNR (5.7) for various parameter values, but the final best-fit signals (5.9) are identical. [Of course, Wiener optimal filtering is normally only carried out when the dependence of the waveform $h(t; a_j)$ on the parameters $a_j$ is complicated and nonlinear, as when searching for inspiral waves where the parameters represent astrophysical characteristics of the binary system.]
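As a concrete illustration of the equivalence between the perpendicular projection (5.5) and SNR maximization over the linear template family (5.6), the following Python sketch (using NumPy) treats a small toy problem. The exponentially correlated noise covariance, the random basis of $U$, and the function names `inner`, `project` and `snr` are illustrative assumptions that do not appear in the analysis above; we also assume that $\Gamma_{ij}$ is the inverse of the noise covariance matrix. Since the SNR (5.7) is unchanged when a template is rescaled, the check compares SNR values rather than amplitudes.

```python
import numpy as np

def inner(a, b, gamma):
    """Noise-weighted inner product (a|b) = Gamma_ij a^i b^j."""
    return float(a @ gamma @ b)

def project(s, basis, gamma):
    """Perpendicular projection of s onto the span of the columns of `basis`."""
    gram = basis.T @ gamma @ basis            # u_jk = (u_j|u_k)
    overlaps = basis.T @ gamma @ s            # (u_j|s)
    coeffs = np.linalg.solve(gram, overlaps)  # applies the inverse matrix u^{jk}
    return basis @ coeffs, coeffs

def snr(s, h, gamma):
    """SNR of the template h with the data s, as in Eq. (5.7)."""
    return inner(s, h, gamma) / np.sqrt(inner(h, h, gamma))

rng = np.random.default_rng(0)
n_data, n_u = 64, 5

# Assumed toy noise model: exponentially correlated samples.
idx = np.arange(n_data)
covariance = 0.5 ** np.abs(idx[:, None] - idx[None, :])
gamma = np.linalg.inv(covariance)

basis = rng.standard_normal((n_data, n_u))   # arbitrary basis of U (hypothetical)
s = rng.standard_normal(n_data)              # stand-in for the detector output

h_best, a_hat = project(s, basis, gamma)

# The residual s - P_U(s) is orthogonal to every basis vector of U.
residual_overlaps = basis.T @ gamma @ (s - h_best)

# No randomly drawn template in U does better than the projection.
snr_best = snr(s, h_best, gamma)
snr_trials = [snr(s, basis @ rng.standard_normal(n_u), gamma) for _ in range(200)]
```

The projection requires a single linear solve with the $n_U \times n_U$ Gram matrix, whereas a direct search would evaluate the SNR over an $n_U$-dimensional grid of parameter values; this is the operational difference noted in the text.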



To summarize, the maximum likelihood estimator (5.4) 
gives a general procedure for specifying a filtering algo- 
rithm adapted to a given linear subspace U of the space 
of signals V. We will suggest below a specific choice for 
the subspace U; but first, we discuss some general issues 
related to making such a choice. 

At the very least, we would like our choice of U to 
effect truncation of the measured data stream in both 
the time domain and the frequency domain, down to the 
intervals of time and frequency in which we expect the 
merger waveform to lie. (We assume that the duration 
of the data being analyzed, T' , will be somewhat longer 
than one's guess of the merger duration, T.) Because 
of the uncertainty principle, such a truncation cannot 
be done exactly. Moreover, for fixed specific intervals 
of time and of frequency, there are different, inequiva- 
lent ways of approximately truncating the signal to these 
intervals [53]. The differences between the inequivalent
methods are essentially due to aliasing effects. Such effects
cannot always be neglected in the analysis of merger
waveforms, because the duration $T \sim 10M - 100M$ [11]
of the waveform is probably only a few times larger than 
the reciprocal of the highest frequency of interest. 

It turns out that the simplest method of truncating in 
frequency (band-pass filtering) is, to a good approxima- 
tion, a projection of the type (5.4) that we are consider- 
ing. Truncating in the time domain, on the other hand, 
is not a projection of this type. 

Let us first discuss band-pass filtering. Let $\mathbf{d}_j$ [cf. Eq. (2.8)] be a frequency domain basis of $V$. For a given frequency interval $[f_{\rm char} - \Delta f/2,\, f_{\rm char} + \Delta f/2]$, let $U$ be the subspace of $V$ spanned by the elements $\mathbf{d}_j$ with $|f_{\rm char} - f_j| < \Delta f/2$, i.e., the span of the subset of the frequency domain basis that corresponds to the given frequency interval. Then the projection operation $P_U$ is to a moderate approximation just the band-pass filter:

$$P_U\left[\sum_{j=1}^{\mathcal{N}'_{\rm bins}} s^j \mathbf{d}_j\right] \approx {\sum_j}'\, s^j \mathbf{d}_j, \tag{5.10}$$

where the notation $\sum'$ means that the sum is taken only over the appropriate range of frequencies. The reason for the relation (5.10) is that the basis $\mathbf{d}_j$ is approximately orthogonal with respect to the noise inner product (2.14): different frequency components of the noise are statistically independent up to small aliasing corrections of the order of $\sim 1/(f_{\rm char} T')$. Thus, if our a priori information is that the signal lies within a certain frequency interval, then the maximum likelihood estimate of the signal is approximately given by passing the data stream through a band-pass filter.

An analogous statement is not true in the time domain. If our a priori information is that the signal vanishes outside a certain interval of time, then truncating the data stream by throwing away the data outside of this interval will not give the maximum likelihood estimate of the signal. This is because of statistical correlations between sample points just inside and just outside of the time interval: the measured data stream outside the interval gives information about what the noise inside the interval is likely to be. These correlation effects become unimportant in the limit $T f_{\rm char} \to \infty$, but for BBH merger signals $T f_{\rm char}$ is probably $\lesssim 20$. The correct maximum likelihood estimator of the waveform, when our prior information is that the signal vanishes outside of a certain time interval, is given by Eq. (5.5) with the basis $\{\mathbf{u}_1, \ldots, \mathbf{u}_{n_U}\}$ replaced by the appropriate subset of the time-domain basis $\{\mathbf{e}_1, \ldots, \mathbf{e}_{\mathcal{N}'_{\rm bins}}\}$ discussed in Sec. II.
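The difference between simple time truncation and the maximum likelihood estimator for a time-limited signal can be seen in a small numerical example. The Python sketch below is only illustrative: the exponentially correlated noise covariance, the interval boundaries, and the function names `ml_time_limited` and `truncate` are assumptions made for the example, and $\Gamma_{ij}$ is taken to be the inverse of the noise covariance matrix. For white noise the two operations coincide; for correlated noise the projection also uses the data outside the interval.

```python
import numpy as np

n_data = 40
inside = np.arange(10, 30)   # sample indices of the assumed signal interval

def ar1_covariance(n, r):
    """Assumed toy noise model: correlation r**|i-j| between samples i and j."""
    idx = np.arange(n)
    return r ** np.abs(idx[:, None] - idx[None, :])

def ml_time_limited(s, covariance, inside):
    """Projection of s onto the time-domain basis vectors e_i with i in `inside`."""
    gamma = np.linalg.inv(covariance)
    basis = np.eye(len(s))[:, inside]
    gram = basis.T @ gamma @ basis
    coeffs = np.linalg.solve(gram, basis.T @ gamma @ s)
    return basis @ coeffs

def truncate(s, inside):
    """Naive estimate: keep the data inside the interval and discard the rest."""
    out = np.zeros_like(s)
    out[inside] = s[inside]
    return out

rng = np.random.default_rng(2)
s = rng.standard_normal(n_data)

white_estimate = ml_time_limited(s, np.eye(n_data), inside)
colored_estimate = ml_time_limited(s, ar1_covariance(n_data, 0.9), inside)
difference = float(np.max(np.abs(colored_estimate - truncate(s, inside))))
```

In this toy noise model the matrix $\Gamma_{ij}$ is tridiagonal, so only the samples immediately adjacent to the interval enter the correction; more general noise spectra couple more of the outside data into the estimate.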

Our suggested choice of subspace $U$ and corresponding specification of a filtering method is as follows. Pick a wavelet basis $\mathbf{w}_{ij}$ of the type discussed above. (The filtering method will depend only weakly on which wavelet basis is chosen.) Then, the subspace $U$ is taken to be the span of a suitable subset of this wavelet basis, according to our prior prejudice regarding the bandwidth and duration of the signal. The dimension $n_U$ of $U$ will be given by

$$n_U = \mathcal{N}_{\rm bins} = 2T\Delta f. \tag{5.11}$$



In more detail, the filtering method would work as follows. First, band-pass filter the data stream and truncate it in time, down to intervals of frequency and time that are several times larger than are ultimately required, in order to reduce the number of independent data points $\mathcal{N}'_{\rm bins}$ to a manageable number. Second, for the wavelet basis $\mathbf{w}_{ij}$ of this reduced data set, calculate the matrix $w_{ij\,i'j'} = (\mathbf{w}_{ij}|\mathbf{w}_{i'j'})$. Recall that the index $i$ corresponds to a time $t_i$, and the index $j$ to a frequency $f_j$ [cf. the discussion preceding Eq. (5.2)]. Third, pick out the sub-block $\bar{w}_{ij\,i'j'}$ of the matrix $w_{ij\,i'j'}$ for which the times $t_i$ and $t_{i'}$ lie in the required time interval, and for which the frequencies $f_j$ and $f_{j'}$ lie in the required frequency interval. Numerically invert this matrix to obtain $\bar{w}^{ij\,i'j'}$. Finally, the best-fit waveform is given by

$$\mathbf{h}_{\rm best-fit} = {\sum_{ij}}'\,{\sum_{i'j'}}'\, \bar{w}^{ij\,i'j'}\,(\mathbf{s}|\mathbf{w}_{i'j'})\,\mathbf{w}_{ij}, \tag{5.12}$$

where $\sum'$ means the sum over the required time and frequency intervals.

Note that the best-fit signal in this case is not given by first taking the finite wavelet transform of the reduced data (i.e., finding the coefficients $s^{ij}$ in the expansion $\mathbf{s} = \sum_{ij} s^{ij}\mathbf{w}_{ij}$) and then throwing away the coefficients outside of the required time and frequency intervals, which would yield

$$\mathbf{h} = {\sum_{ij}}'\, s^{ij}\,\mathbf{w}_{ij}. \tag{5.13}$$

Note also that the best-fit signal (5.12) would also be obtained by calculating the SNR (5.7) for the family of waveforms

$$h(t; c_{ij}) = {\sum_{ij}}'\, c_{ij}\, w_{ij}(t), \tag{5.14}$$

and by maximizing over the $c_{ij}$'s, as discussed above. This essentially corresponds to building a family of templates with the wavelet basis, and then performing matched filtering with that bank of templates.



C. Extension of method to incorporate other types
of prior information

A more sophisticated filtering method can be obtained by a generalization of the above analysis. Let us suppose that the prior PDF $p^{(0)}(\mathbf{h})$ is a general multivariate Gaussian in $\mathbf{h}$. For example, one could choose the prior PDF to be of the form

$$p^{(0)}(\mathbf{h}) \propto \exp\left[-\frac{1}{2}\sum_{ij}\frac{(h_{ij} - \bar{h}_{ij})^2}{\alpha_{ij}^2}\right], \tag{5.15}$$

where $h_{ij}$ are the expansion coefficients of the signal $\mathbf{h}$ on some fixed wavelet basis $\mathbf{w}_{ij}$, so that $\mathbf{h} = \sum_{ij} h_{ij}\mathbf{w}_{ij}$. Then, by making suitable choices of the parameters $\bar{h}_{ij}$ and $\alpha_{ij}$, such a PDF could be chosen to encode the information that the frequency content of the signal at early times is concentrated near $f_{\rm merge}$, that the signal joins smoothly onto the inspiral waveform, that at the end of merger the dominant frequency component is that of quasi-normal ringing, etc. For any such prior PDF, it is straightforward to calculate the corresponding maximum likelihood estimator. If the prior PDF has expected value $\mathbf{h}_0$ and variance-covariance matrix $\boldsymbol{\Sigma}_0$, then the estimator is

$$\mathbf{h}_{\rm best-fit}(\mathbf{s}) = \left[\boldsymbol{\Gamma} + \boldsymbol{\Sigma}_0^{-1}\right]^{-1}\cdot\left[\boldsymbol{\Gamma}\cdot\mathbf{s} + \boldsymbol{\Sigma}_0^{-1}\cdot\mathbf{h}_0\right]. \tag{5.16}$$

Such a waveform estimator could be calculated numerically.
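The Gaussian-prior estimator of Sec. V C can be illustrated with a short numerical sketch. For a Gaussian prior with mean $\mathbf{h}_0$ and covariance $\boldsymbol{\Sigma}_0$, maximizing the PDF (5.3) amounts to solving the linear system $(\boldsymbol{\Gamma} + \boldsymbol{\Sigma}_0^{-1})\,\mathbf{h} = \boldsymbol{\Gamma}\mathbf{s} + \boldsymbol{\Sigma}_0^{-1}\mathbf{h}_0$, which is the form assumed for Eq. (5.16) in the Python code below. The noise model, the prior mean and widths, and the function names are hypothetical choices made only for the example, and $\Gamma_{ij}$ is taken to be the inverse of the noise covariance matrix.

```python
import numpy as np

def gaussian_prior_estimate(s, gamma, h0, sigma0):
    """Maximum of the posterior (5.3) for a Gaussian prior with mean h0, covariance sigma0."""
    sigma0_inv = np.linalg.inv(sigma0)
    return np.linalg.solve(gamma + sigma0_inv, gamma @ s + sigma0_inv @ h0)

def log_posterior(h, s, gamma, h0, sigma0):
    """Logarithm of the posterior PDF, up to an additive constant."""
    r = h - s
    d = h - h0
    return -0.5 * (r @ gamma @ r) - 0.5 * (d @ np.linalg.solve(sigma0, d))

rng = np.random.default_rng(4)
n = 16
idx = np.arange(n)

# Assumed toy noise model and prior (illustrative values only).
gamma = np.linalg.inv(0.5 ** np.abs(idx[:, None] - idx[None, :]))
s = rng.standard_normal(n)
h0 = np.sin(2.0 * np.pi * idx / n)
sigma0 = np.diag(np.linspace(0.1, 2.0, n) ** 2)

h_fit = gaussian_prior_estimate(s, gamma, h0, sigma0)

# Limiting cases: a very broad prior returns the data, a very narrow one returns h0.
broad = gaussian_prior_estimate(s, gamma, h0, 1e8 * np.eye(n))
narrow = gaussian_prior_estimate(s, gamma, h0, 1e-8 * np.eye(n))
```

The two limiting cases bracket the behavior of the estimator: directions in which the prior is broad are determined by the data, and directions in which it is narrow are fixed by the prior, which is how a Gaussian with very unequal widths reproduces the projection (5.4).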



D. Fidelity of waveform recovery 

In this subsection we address the question of how close, statistically, we expect our estimated waveform $h_{\rm best-fit}(t)$ to be to the original gravitational waveform $h(t)$. We can quantify the closeness by means of the correlation coefficient

$$\mathcal{C} \equiv \frac{(\mathbf{h}|\mathbf{h}_{\rm best-fit})}{\sqrt{(\mathbf{h}|\mathbf{h})}\,\sqrt{(\mathbf{h}_{\rm best-fit}|\mathbf{h}_{\rm best-fit})}}, \tag{5.17}$$

which takes values between $-1$ and $1$. In Appendix C we show that for estimators of the form (5.4), the expected value of $\mathcal{C}$ is approximately given by

$$\langle\mathcal{C}\rangle \approx \frac{1}{\sqrt{1 + 1/\rho_{\rm bin}^2}}, \tag{5.18}$$

where $\rho_{\rm bin}^2$ is the matched filtering SNR squared per frequency bin, given by

$$\rho_{\rm bin}^2 \equiv \frac{\rho^2}{\mathcal{N}_{\rm bins}}. \tag{5.19}$$



Thus, as one would expect, the best-guess reconstructed 
waveform agrees closely with the original gravitational 
waveform (C is close to 1) when there is large SNR 
squared in each frequency bin, and vice- versa |p3j. This 
result will also be approximately valid for waveforms obtained
by simple band-pass filtering when the duration
$T$ of the signal satisfies $T \gg 1/\Delta f$ (where $\Delta f$ is the
frequency bandwidth of the signal).

Note that the quantity $\rho_{\rm bin}$ is to a good approximation just the SNR which one obtains from band-pass filtering, from Eq. (4.13) above. Our criterion for the signal to be visible can therefore be written as $\rho_{\rm bin} \gtrsim 1$. So our criteria for signal visibility and for reconstructed signal fidelity turn out to be essentially identical: the fidelity of signal reconstruction is good when the merger signal is easily visible above the noise, as is fairly obvious intuitively.
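The dependence of the expected correlation coefficient on the SNR per bin can be checked with a simple Monte Carlo experiment. The Python sketch below assumes the form $\langle\mathcal{C}\rangle \approx 1/\sqrt{1 + 1/\rho_{\rm bin}^2}$ for Eq. (5.18), which follows from $\langle(\mathbf{h}|\mathbf{h}_{\rm best-fit})\rangle = \rho^2$ and $\langle(\mathbf{h}_{\rm best-fit}|\mathbf{h}_{\rm best-fit})\rangle = \rho^2 + \mathcal{N}_{\rm bins}$. It further assumes unit-variance white noise (so that the inner product is the ordinary dot product) and a signal lying entirely in $U$, with $U$ identified with the full toy data space so that $P_U(\mathbf{s}) = \mathbf{s}$. The function names and the numbers of bins and trials are illustrative.

```python
import numpy as np

def correlation(h, h_fit):
    """Correlation coefficient of Eq. (5.17) for a white-noise inner product."""
    return (h @ h_fit) / np.sqrt((h @ h) * (h_fit @ h_fit))

def mean_correlation(rho_bin, n_bins, n_trials, rng):
    """Monte Carlo average of C for a signal with SNR^2 = rho_bin^2 * n_bins."""
    h = rng.standard_normal(n_bins)
    h *= rho_bin * np.sqrt(n_bins) / np.linalg.norm(h)
    values = []
    for _ in range(n_trials):
        s = h + rng.standard_normal(n_bins)   # data = signal + unit-variance noise
        values.append(correlation(h, s))      # here P_U(s) = s
    return float(np.mean(values))

def predicted(rho_bin):
    """Assumed form of Eq. (5.18)."""
    return 1.0 / np.sqrt(1.0 + 1.0 / rho_bin**2)

rng = np.random.default_rng(3)
results = {
    rho_bin: (mean_correlation(rho_bin, 50, 4000, rng), predicted(rho_bin))
    for rho_bin in (0.5, 1.0, 3.0)
}
```

Each entry of `results` pairs the simulated mean with the predicted value, so the two can be compared directly for a weak, a marginal, and a strong signal.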



VI. USING INFORMATION PROVIDED BY 
REPRESENTATIVE SUPERCOMPUTER 
SIMULATIONS 

In this section we propose a computational strategy for 
numerical relativists to pursue, if they successfully pro- 
duce computer codes capable of simulating BBH merg- 
ers, but if running such codes is too expensive to permit 
an extensive survey of the merger parameter space. In 
this case, for LIGO/VIRGO data analysis purposes, it 
would be advantageous to do a very coarse survey of the 
parameter space aimed at determining the ranges of sev- 
eral key parameters and at answering several qualitative 
questions, as we now describe. 

• Do the waveforms contain a strong signature of an 
"innermost stable circular orbit" (ISCO) J||f4|? 
In the extreme mass ratio limit $\mu \ll M$, there is
such an orbit, and when the smaller inspiralling 
black hole reaches it there is a transition from a 
radiation-reaction-driven inspiral to a freely falling 
plunge |5jJ . Correspondingly, there is a sharp drop 
in the radiated energy per unit logarithmic frequency
$dE/d(\ln f)$ at the frequency corresponding
to this orbit. However, in the equal-mass case,
there may not be a sharp feature in the $dE/d(\ln f)$



plot, if the timescale over which the orbital instabil- 
ity operates is comparable to the radiation reaction 
timescale. Or, if the spins of the individual black 
holes are large and parallel to the orbital angular 
momentum, the inspiral may smoothly join into the 
merger without any plunge. In the former case, 
the concept of ISCO would not really be meaning- 
ful; and in the latter case, there would simply be 
nothing resembling an ISCO in the evolution. Sim- 
ulations should be able to settle this issue. 

• A closely related question is: At what frequency 
does the adiabatic approximation break down? As 
seen in a coordinate system which co-rotates with 
the black holes, the system evolves on a radiation- 
reaction timescale which is initially much longer 
than the orbital period |l^,[l4|]. When does this 
separation of timescales break down? This sep- 
aration of timescales underlies proposed methods 
of calculating templates in the so-called Interme- 
diate Binary Black Hole (IBBH) regime after the 
post-Newtonian approximation fails at $r \sim 12M$
[13]-[15]. Therefore, fully numerical templates will
have to be used after the adiabatic approximation 
fails. Resolving this issue will probably require ex- 
ploration of both numerical relativity simulations 
and IBBH calculations. If the black holes' spins 
are small, one might expect the transition point to 
coincide with estimates of the location of the last 
stable circular orbit around r ~ 6M; our es- 
timate (4.4) of the frequency of the transition from 
inspiral to merger roughly corresponds to this ex- 
pectation. But with large spins, the system might 
evolve adiabatically all the way into the merger. 
(Note, however, that numerical relativity will still 
be needed to model such evolution, whether it is 
adiabatic or not.) 

• What is the approximate duration of the merger 
signal, and how does it depend on the merger pa- 
rameters such as the initial spins of the black holes 
and the mass ratio? The range of merger signal 
durations will be an important input to algorithms 
for reconstructing the merger waveform from the 
noisy data stream (see Sec. V), particularly in those
cases in which the ringdown and/or inspiral signals
are too weak to be seen in the data stream.
Moreover, the duration of the waveform (together
with its bandwidth) approximately determines the
amount by which the SNR from band-pass filtering
is lower than the matched filtering SNR obtained
with merger templates [cf. Eq. (4.13)]. 



• A closely related issue is: How much energy is ra- 
diated in the merger waves relative to the ring- 
down waves? Operationally, this question reduces 
to asking what proportion of the total waveform 
produced during the coalescence can be accurately 
fit by the ringdown's decaying sinusoid. In paper I
we argued that if the spins of the individual black
holes are large and aligned with one another and 
with the orbital angular momentum, then the sys- 
tem has too much angular momentum for it to be 
lost solely through the ringdown, and that there- 
fore the ringdown waves should not dominate the 
merger. On the other hand, if the spins of the black 
holes are small, most of the radiated energy might 
well come out in ringdown waves. 

• What is the frequency bandwidth in which most of
the merger waves' power is concentrated? In Ref.
[11] we assumed that when one excises in the time
domain the ringdown portion of the signal, the remaining
signal has no significant power at frequencies
above the quasi-normal ringing frequency of
cies above the quasi-normal ringing frequency of 
the final Kerr black hole. However, this assumption 
may not be valid. As with the signal's duration, the 
range of bandwidths of merger waveforms will be an 
input to algorithms for reconstructing the merger 
waveform from the noisy data (see Sec. V), so this
is an important issue. 

• To what extent does the merger waveform chirp
monotonically? If we represent the merger waves 
on a time-frequency wavelet basis, then we know 
that at early times, the waves are concentrated 
at one frequency with additional contributions in 
nearby harmonics. At the end of the merger sig- 
nal, most of the power is concentrated near the fre- 
quency of quasi-normal ringing of the final black 
hole. One could extrapolate in the time-frequency 
plane a line joining twice the orbital frequency at 
the end of inspiral to the quasinormal ringing fre- 
quency at the start of ringdown. To what extent is 
the merger signal concentrated near this line in the 
time-frequency plane? 

• How much of the merger can be described as higher
order QNR modes? By convention, we have been
calling that phase of the coalescence which is dominated
by the most slowly damped, $l = m = 2$ mode
the ringdown phase; but, before this mode dominates,
QNR modes with different values of $l$ and/or
$m$ are likely to be present. After the merger has
evolved to the point when the merged object can be 
accurately described as a linear perturbation about 
a stationary black hole background, there might or 
might not be any significant subsequent period of 
time before the higher order modes have decayed 
away so much as to be undetectable. If simulations 
predict that higher order QNR modes are strong 
for a significant period of time, then these higher 
order QNR modes should be found by the normal 
ringdown search of the data stream; no extra search 
should be needed. 

• Does the merger signal have the property that we
can distinguish a "carrier waveform" and a "mod- 



ulation"? This separation would require that the 
carrier waveform have a fairly large number of cy- 
cles at a frequency well separated from that of the 
modulation. It would also require some mechanism 
to produce modulation, one possibility being the 
precession of the black hole spins. It is known that 
spin precession does modulate the inspiral waveform
[56,57], and it is possible that a similar precession
might be present during at least part of the
merger. 

An improved understanding of these issues would be
of use both in extracting [cf. Sec. V C above] and in in- 
terpreting the merger waveforms. 



VII. INFORMATION OBTAINABLE FROM THE 
MERGER PHASE OF THE WAVES USING 
TEMPLATES 

In the remainder of the paper we consider the opti- 
mistic scenario in which a complete set of supercomputer 
generated theoretical merger waveforms is available for 
data analysis. In this section we describe in qualitative 
terms the extra information that one can extract from 
the merger waves using templates. 



In Sec. VIII we estimate how accurate numerical templates need to be for
data analysis purposes, and in Sec. IX we estimate the 
total number of bits of information obtainable from the 
merger waves using templates, and discuss implications 
for the requirements on one's grid of templates. 

If merger templates are available, it should be possible 
to perform Wiener optimal filtering of the data stream 
for the merger signal, just as will be done for the in- 
spiral and ringdown signals. When one has no informa- 
tion about the BBH system, one would simply filter the 
data with all numerical merger templates available, po- 
tentially a very large number. However, if the inspiral 
and/or the ringdown signals have already been measured 
(as will be the case for most detected signals), some infor- 
mation about the black hole binary's constituents will be 
available. In such cases the total number of merger tem- 
plates needed will be reduced, perhaps substantially; one 
need consider only those numerical templates whose pa- 
rameters are commensurate with the inspiral/ringdown 
measurements. 

It may turn out that black hole mergers have such a 
wide variety of behaviors that it will not be feasible to 
produce a complete family of templates, even with a numerical
code that can evolve mergers and produce waveforms.
In such an eventuality, as mentioned in Sec. I C
above, the interpretation of an observed merger waveform 
could proceed as follows: The numerical relativists, with 
noisy data and numerical code in hand, carry out a se- 
ries of iterated numerical simulations, trying to produce 
a waveform that matches the observed data. (Clearly, it 
would be very useful for such a procedure to have as much
prior information as possible about the system's parameters
from the inspiral and/or ringdown phases, so that
the numerical relativists will know where in the binary 
black hole parameter space to concentrate their computa- 
tional efforts.) Thus, matched filtering might be possible 
even if the computation of a complete set of template 
waveforms is too difficult to perform. 

In attempting to match a merger template with 
gravitational-wave data, one's primary goal would be to 
provide a test of general relativity rather than the mea- 
surement of parameters. A good match between the mea- 
sured waveform and a numerical template would consti- 
tute a strong test of general relativity and provide the 
oft-quoted unambiguous detection of black holes. (Such 
an unambiguous detection could also come from a mea- 
surement of the quasinormal ringing signal.) Although 
not the primary goal, matches between numerical merger 
templates and the data stream would also be useful in 
measuring some of the system's parameters, such as the 
total mass M or the spin parameter a of the final black 
hole [58]. These merger parameter measurements could
provide additional information about the source, over
and above that obtainable from the inspiral and ringdown
signals. For instance, in the second example discussed
in Sec. III (a $30\,M_\odot$ BBH at $z = 1$), the total
redshifted mass $(1 + z)M$ would be essentially unconstrained
by the inspiral and ringdown waveforms, but
might be extractable from the measured merger waveform.
In other cases, a quantitative test of general relativity
could be obtained by verifying that parameters
measured from the merger phase are consistent with pa- 
rameter measurements from the inspiral and ringdown 
phases. 

A close match between measured and predicted wave- 
forms for BBH mergers might also constrain some pos- 
sible theories of gravity that generalize general relativ- 
ity. Clifford Will has shown that the inspiral portion 
of the waveform for neutron star-neutron star mergers 
will strongly constrain the dimensionless parameter $\omega$
of Brans-Dicke theory [59]. Unfortunately, the most
theoretically natural class of generalizations of general
relativity compatible with known experiments, the so-called
scalar-tensor theories [60], may not be strongly
constrained (if at all) by measurements of BBH mergers,
since black holes, unlike neutron stars, cannot have any
scalar hair in such theories [61].

In order for the above endeavors to be successful, the 
numerical templates must be sufficiently accurate. In 
the next section, we turn to a discussion of how accurate 
numerical templates need to be in order to extract the 
information in merger signals. 



VIII. ACCURACY REQUIREMENTS FOR 
MERGER WAVEFORM TEMPLATES 

There will be unavoidable errors in the waveform tem- 
plates produced by supercomputer simulations, since 



these simulations are numerical. Suppose that the physical
waveform for some particular source is $h(t; \boldsymbol{\theta})$, where
the components of the vector $\boldsymbol{\theta} = (\theta^1, \ldots, \theta^{n_p})$ represent
the various parameters upon which the waveform depends.
Then, a simulation of the evolution of that source
will predict a slightly different waveform $h(t; \boldsymbol{\theta}) + \delta h(t; \boldsymbol{\theta})$,
where $\delta h(t; \boldsymbol{\theta})$ is the numerical error. One would like the
numerical error to be small enough not to have a significant
effect on signal searches, parameter extraction or
any other types of data analysis that might be carried out
using the template waveforms. In this section we suggest
an approximate rule of thumb [Eqs. (8.1) and (8.2)] for
estimating when numerical errors are sufficiently small,
and discuss its meaning and derivation.



A. Accuracy criterion and implementation 

The accuracy criterion can be simply expressed in terms of the inner product introduced in Sec. II above [which is defined by Eq. (2.3) or alternatively by Eqs. (2.11)-(2.14)]: For a given template $h(t)$, our rule of thumb is that the numerical error $\delta h(t)$ should be small enough that the quantity

$$\Delta \equiv \frac{1}{2}\,\frac{(\delta h|\delta h)}{(h|h)} \tag{8.1}$$

satisfies

$$\Delta \le 0.01. \tag{8.2}$$

[The fractional loss in event detection rate in signal searches is $\sim 3\Delta$, so the value of 0.01 in Eq. (8.2) is chosen to correspond to a 3% loss in event rate; see Sec. VIII B below.] For the purpose of evaluating the inner product numerically, note that the absolute normalization of the noise spectrum $S_h(f)$ is unimportant, and that one could use, for example, Eqs. (4.1)-(4.3) of Ref. [11] to specify the shape of the noise spectrum.

In practice, Eq. (8.2) translates to a fractional accuracy per data point $h_j = h(t_j)$ of about $0.01/\sqrt{\mathcal{N}_{\rm points}}$, where $\mathcal{N}_{\rm points}$ is the number of numerical data points used to describe the templates, if the errors at each data point are effectively uncorrelated. If, however, these errors add coherently in the integral (8.1), the requirement on fractional accuracy at each data point will be more stringent.

It should be straightforward in principle to ensure that numerical templates satisfy the criterion (8.2). Let us schematically denote a numerically generated template as $h_{\rm num}(t, \varepsilon)$, where $\varepsilon$ represents the set of tolerances (grid size, size of time steps, etc.) that govern the accuracy of the numerical calculation. (Representing this set of parameters by a single parameter $\varepsilon$ is an oversimplification but is adequate for the purposes of our discussion.) One can then iterate one's calculations varying the parameter $\varepsilon$ in order to obtain templates that are sufficiently accurate, using the following standard type of procedure: First, calculate the template $h_{\rm num}(t, \varepsilon)$. Second, calculate the more accurate template $h_{\rm num}(t, \varepsilon')$ for some choice of $\varepsilon' < \varepsilon$, for example $\varepsilon' = \varepsilon/2$. Third, make the identifications

$$h(t) \equiv h_{\rm num}(t, \varepsilon'), \qquad \delta h(t) \equiv h_{\rm num}(t, \varepsilon) - h_{\rm num}(t, \varepsilon'), \tag{8.3}$$

and insert these quantities into Eq. (8.1) to calculate $\Delta$. This allows one to assess the accuracy of the template $h_{\rm num}(t, \varepsilon)$. Finally, iterate this procedure until Eq. (8.2) is satisfied.
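The iterative procedure just described can be sketched in a few lines of Python. The example is a toy model rather than a numerical relativity calculation: the "numerical template" `h_num` is a hypothetical chirp whose phase is integrated with a crude left Riemann sum of step size `eps`, the inner product is taken to be an ordinary dot product (an assumed flat noise spectrum, whose normalization cancels in $\Delta$), and the identifications $h = h_{\rm num}(t, \varepsilon/2)$ and $\delta h = h_{\rm num}(t, \varepsilon) - h_{\rm num}(t, \varepsilon/2)$ are those assumed for Eq. (8.3).

```python
import numpy as np

T_TOTAL = 10.0                              # template duration (arbitrary units)
T_OUT = np.linspace(0.0, T_TOTAL, 512)      # fixed output grid t_j

def frequency(t):
    """Hypothetical chirp frequency, rising linearly from 1 to 3."""
    return 1.0 + 0.2 * t

def h_num(eps):
    """Toy numerical template: phase from a left Riemann sum with step close to eps."""
    n_steps = int(round(T_TOTAL / eps))
    t_fine = np.linspace(0.0, T_TOTAL, n_steps + 1)
    step = T_TOTAL / n_steps
    increments = 2.0 * np.pi * frequency(t_fine[:-1]) * step
    phase = np.concatenate(([0.0], np.cumsum(increments)))
    return np.cos(np.interp(T_OUT, t_fine, phase))

def inner(a, b):
    """Inner product for an assumed flat noise spectrum (normalization irrelevant)."""
    return float(np.dot(a, b))

def accuracy_parameter(h, dh):
    """The quantity Delta = (dh|dh) / (2 (h|h)) of Eq. (8.1)."""
    return 0.5 * inner(dh, dh) / inner(h, h)

def refine(eps, threshold=0.01, max_iter=20):
    """Halve eps until the estimated Delta satisfies the criterion (8.2)."""
    for _ in range(max_iter):
        h_coarse = h_num(eps)
        h_fine = h_num(eps / 2.0)
        delta = accuracy_parameter(h_fine, h_coarse - h_fine)
        if delta <= threshold:
            return eps, delta
        eps /= 2.0
    raise RuntimeError("tolerance not reached within max_iter refinements")

eps_final, delta_final = refine(0.2)
```

The returned `eps_final` is the coarsest tolerance in the halving sequence for which the estimated $\Delta$ meets the threshold. The estimate uses the finer run as a stand-in for the exact waveform, so it is only as reliable as the convergence of the underlying scheme.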



B. Derivation and meaning of accuracy criterion 

The required accuracy for the numerical templates de- 
pends on how and for what purpose those templates are 
used. As discussed in the Introduction, merger templates 
might be used in several different ways: 

• They might be used as search templates for signal 
searches using matched filtering. Such searches will 
probably not be feasible, at least initially, as they 
would require the computation of an inordinately 
large number of templates. 

• For BBH events that have already been detected 
via matched filtering of the inspiral or ringdown 
waves, or by the noise-monitoring detection tech- 
nique (ll],|2(| applied to the merger waves, the 
merger templates might be used for matched fil- 
tering in order to measure the binary's parameters 
and test general relativity. This use of merger templates
could correspond to the third scenario that
was discussed in Sec. I C, where iterated runs of



the supercomputer codes are performed to produce 
a template that best fits a dataset known to con- 
tain BBH merger gravitational waves. This sce- 
nario would not require that a complete set of tem- 
plates be computed and stored, and thus is some- 
what more feasible than matched filtering signal 
searches using the merger waves. 

• If one has only a few, representative supercomputer 
simulations and their associated waveform tem- 
plates at one's disposal, one might simply perform 
a qualitative comparison between the measured 
waveform and templates in order to deduce qual- 
itative information about the BBH source. This is 
the second scenario described in Sec. I C.

In this section we estimate the accuracy requirements 
for the first two of these uses of merger templates. 

Consider first signal searches using matched filtering. The expected SNR $\rho$ obtained for a gravitational waveform $h(t)$ when using a template waveform $h_T(t)$ is given by

$$\rho = \frac{(h|h_T)}{\sqrt{(h_T|h_T)}}. \tag{8.4}$$

If we substitute $h_T(t) = h(t) + \delta h(t)$ into Eq. (8.4) and expand to second order in $\delta h$, we find that the fractional loss $\delta\rho/\rho$ in SNR produced by the numerical error $\delta h(t)$ is given by

$$\frac{\delta\rho}{\rho} = \Delta_1 + O[(\delta h)^3], \tag{8.5}$$

where

$$\Delta_1 \equiv \frac{1}{2}\left[\frac{(\delta h|\delta h)}{(h|h)} - \frac{(\delta h|h)^2}{(h|h)^2}\right]. \tag{8.6}$$

Note that the quantity $\Delta_1$ is proportional to $(\delta h_1|\delta h_1)$, where $\delta h_1$ is the component of $\delta h$ perpendicular to $h$ with respect to the inner product (2.14). Thus, a numerical error of the form $\delta h(t) \propto h(t)$ will not contribute to the fractional loss (8.5) in SNR. This is to be expected, since the quantity (8.4) is independent of the absolute normalization of the templates $h_T(t)$.
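The second-order expansion of the SNR loss can be checked numerically. The Python sketch below assumes the expressions $\rho = (h|h_T)/\sqrt{(h_T|h_T)}$, $\Delta = (\delta h|\delta h)/[2(h|h)]$ and $\Delta_1 = \frac{1}{2}[(\delta h|\delta h)/(h|h) - (\delta h|h)^2/(h|h)^2]$ for Eqs. (8.4), (8.1) and (8.6). The damped-sinusoid waveform, the random error vector, and the flat-spectrum inner product are illustrative assumptions.

```python
import numpy as np

def inner(a, b):
    """Inner product for an assumed flat noise spectrum (normalization irrelevant)."""
    return float(np.dot(a, b))

def snr(h, h_template):
    """Expected SNR of the waveform h filtered with h_template, as in Eq. (8.4)."""
    return inner(h, h_template) / np.sqrt(inner(h_template, h_template))

def delta(h, dh):
    """The quantity Delta of Eq. (8.1)."""
    return 0.5 * inner(dh, dh) / inner(h, h)

def delta_1(h, dh):
    """The quantity Delta_1 of Eq. (8.6); only the part of dh perpendicular to h counts."""
    hh = inner(h, h)
    return 0.5 * (inner(dh, dh) / hh - inner(dh, h) ** 2 / hh**2)

t = np.linspace(0.0, 1.0, 1000)
h = np.sin(2.0 * np.pi * 20.0 * t) * np.exp(-3.0 * t)   # hypothetical waveform

rng = np.random.default_rng(1)
dh_random = 0.01 * np.abs(h).max() * rng.standard_normal(t.size)
dh_parallel = 0.1 * h                                    # pure normalization error

rho_exact = snr(h, h)
loss_random = 1.0 - snr(h, h + dh_random) / rho_exact
loss_parallel = 1.0 - snr(h, h + dh_parallel) / rho_exact
```

For the random error, `loss_random` should agree closely with `delta_1(h, dh_random)`, while for the error proportional to $h$ both the loss and $\Delta_1$ vanish even though $\Delta$ does not. In both cases $\Delta_1 \le \Delta$, since $\Delta - \Delta_1 = (\delta h|h)^2/[2(h|h)^2] \ge 0$.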

Now, the event detection rate is proportional to the
cube of the SNR, and hence the fractional loss in event
detection rate that results from using inaccurate numerical
templates is approximately $3\,\delta\rho/\rho$ [33]. If one demands
that the fractional loss in event rate be less than,
say, 3%, then one obtains the criterion [62]

$$ \Delta_1 \lesssim 0.01. \tag{8.7} $$

It is clear from Eqs. (8.1) and (8.6) that $\Delta_1 \le \Delta$. Hence,
the condition (8.7) is less stringent than the condition
(8.2) above. The justification for imposing the more
stringent criterion (8.2) rather than (8.7) derives from
the use of templates for parameter extraction.
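
As a small worked check of the step from SNR loss to event-rate loss, the sketch below compares the exact cubic scaling with the linearized estimate $3\,\delta\rho/\rho$. The function name `event_rate_loss` is hypothetical, and the only physical input assumed is that the event rate scales as the cube of the SNR, as stated above.

```python
def event_rate_loss(delta_1):
    # Detection volume, and hence event rate, scales as SNR**3.
    return 1.0 - (1.0 - delta_1) ** 3

exact = event_rate_loss(0.01)   # exact cubic loss for Delta_1 = 0.01
linear = 3 * 0.01               # first-order estimate 3 * (delta rho / rho)
```

With $\Delta_1 = 0.01$ the exact loss is slightly below the linear estimate of 3%, so the criterion (8.7) is mildly conservative.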

Consider next using merger templates for the purpose 
of measuring parameters via matched filtering. In prin- 
ciple, one could hope to measure all of the 15 parameters 
on which the merger waveforms depend by combining 
the outputs of several detectors with a complete bank of 
templates (although in practice the accuracy with which 
some of those 15 parameters can be measured is not likely 
to be very good). In the next few paragraphs we derive
an approximate condition on $\Delta$ [Eq. (8.13)] which results
from demanding that the systematic errors in the
measured values of all the parameters be small compared 
to the statistical errors due to detector noise. (We note 
that one would also like to use matched filtering to test 
general relativity with these waves; the accuracy crite- 
rion that we derive for parameter measurement will also 
approximately apply to tests of general relativity.) 

Let the gravitational waveform be $h(t;\theta)$, where $\theta =
(\theta^1, \ldots, \theta^{n_p})$. Let $\hat\theta^\alpha$, $1 \le \alpha \le n_p$, be the best-fit
values of $\theta^\alpha$ given by the matched-filtering process. The
quantities $\hat\theta^\alpha$ depend on the detector noise and are thus
random variables. In the high SNR limit, the variables
$\hat\theta^\alpha$ have a multivariate Gaussian distribution with (see,
e.g., Ref. 0)

$$ \langle \delta\hat\theta^\alpha \, \delta\hat\theta^\beta \rangle = \Sigma^{\alpha\beta}, \tag{8.8} $$

where $\delta\hat\theta^\alpha = \hat\theta^\alpha - \langle \hat\theta^\alpha \rangle$ and the matrix $\Sigma^{\alpha\beta}$ is defined
after Eq. (2.18). The systematic error $\Delta\theta^\alpha$ in the inferred
values of the parameters $\theta^\alpha$ due to the template error $\delta h$
can be shown to be approximately



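$$ \Delta\theta^\alpha = \Sigma^{\alpha\beta} \left( \frac{\partial h}{\partial \theta^\beta} \,\Big|\, \delta h \right). \tag{8.9} $$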



From Eqs. (8.8) and (8.9) we find that in order to guarantee
that the systematic error in each of the parameters
be smaller than some number $\epsilon$ times that parameter's
statistical error, we must have

$$ \|\delta h_\parallel\|^2 \equiv (\delta h_\parallel \,|\, \delta h_\parallel) \le \epsilon^2. \tag{8.10} $$



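Here $\delta h_\parallel$ is the component of $\delta h$ parallel to the tangent
space of the manifold of signals $\mathcal{S}$ at $h(t;\theta)$ discussed in
Sec. II. It is given by

$$ \delta h_\parallel = \Sigma^{\alpha\beta} \left( \delta h \,\Big|\, \frac{\partial h}{\partial \theta^\alpha} \right) \frac{\partial h}{\partial \theta^\beta}. \tag{8.11} $$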



The magnitude $\|\delta h_\parallel\|$ of this component of $\delta h$ depends
on details of the number of parameters, and on how the
waveform $h(t;\theta)$ varies with these parameters. However,
a strict upper bound is given by

$$ \|\delta h_\parallel\| \le \|\delta h\|. \tag{8.12} $$



If we combine Eqs. (8.1), (8.10), and (8.12) we obtain the
condition

$$ \Delta \lesssim \frac{\epsilon^2}{2\rho^2}. \tag{8.13} $$

If we insert reasonable estimates for $\rho$ and $\epsilon$ (namely,
$\rho \simeq 7$, $\epsilon \simeq 1$) we recover the criterion (8.2). [Note that
the requirement (8.13) is probably rather more stringent
than need be: the left hand side of Eq. (8.12)
is likely smaller than the right hand side by a factor
$\sim \sqrt{n_p/\mathcal{N}_{\rm bins}}$, where $n_p$ is the number of parameters
and $\mathcal{N}_{\rm bins}$ is the dimension of the total space of signals
$V$.]
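
The projection (8.11) and the bound (8.12) can be illustrated with a toy model. This is a sketch under stated assumptions, not the paper's procedure: the tangent vectors $\partial h/\partial\theta^\alpha$ are replaced by random directions, the noise is taken to be white so that the inner product is Euclidean, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_bins, n_params = 30, 15

# Hypothetical tangent vectors dh/dtheta^alpha: random directions in V.
# Assumption: white noise, so (a|b) is the Euclidean dot product.
tangents = rng.normal(size=(n_params, n_bins))
dh = rng.normal(size=n_bins)              # generic template error

fisher = tangents @ tangents.T            # Gamma_{alpha beta} = (d_alpha h | d_beta h)
sigma = np.linalg.inv(fisher)             # Sigma^{alpha beta}

# Eq. (8.11): component of dh parallel to the tangent space.
dh_par = (sigma @ (tangents @ dh)) @ tangents

ratio = np.linalg.norm(dh_par) / np.linalg.norm(dh)

def delta_bound(rho, eps=1.0):
    # Eq. (8.13): Delta <~ eps**2 / (2 rho**2).
    return eps ** 2 / (2.0 * rho ** 2)
```

For a generic error the ratio `ratio` is typically of order $\sqrt{n_p/\mathcal{N}_{\rm bins}}$, and `delta_bound(7.0)` evaluates to about 0.01, which is the criterion (8.2).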



In Sec. IX below we give an alternative derivation of 
the accuracy criterion (8.13) using information theory. 

The expected order of magnitude $\rho \sim 7$ of the SNR
that leads to the criterion (8.2) is appropriate for ground
based interferometers such as LIGO and VIRGO [11].
However, for the space-based LISA interferometer, much
higher SNRs are expected; see, e.g., Ref. [11]. Correspondingly,
numerical templates used for testing relativity
ity and measuring parameters with LISA data will have 
to be substantially more accurate than those used with 
data from ground based instruments. 



IX. NUMBER OF BITS OF INFORMATION 
OBTAINABLE FROM THE MERGER SIGNAL 
AND IMPLICATIONS FOR TEMPLATE 
CONSTRUCTION 

In this section, we describe how to use information 
theory to quantify how much can be learned from a 
gravitational-wave measurement. In information theory,
a quantity called "information" (analogous to entropy)
can be associated with any measurement process: it is
simply the base 2 logarithm of the number of distinguishable
outcomes of the measurement [24,25]. Equivalently,
it is the number of bits required to store the knowledge 
gained from the measurement. Here we specialize the 
notions of information theory to gravitational wave mea- 
surements, and estimate the number of bits of informa- 
tion which one can gain in different cases. 

Let us first consider the situation in which templates
are unavailable. Suppose that our prior information describing
the signal is that it lies inside some frequency
band of width $\Delta f$ say, and that it lies inside some time
interval of length $T$ say. We will denote by $I_{\rm total}$ the
base 2 logarithm of the number of waveforms $h$ that are
distinguishable by the measurement, that are compatible
with our prior information, and that are compatible
with our measurement of the detector output's magnitude
$\rho(s) = \|s\|$ [63]. We give a precise version of this
definition in Sec. IX A below [Eq. (9.2)]. Note that the
vast majority of these $2^{I_{\rm total}}$ waveforms are completely
irrelevant to BBH mergers; the BBH merger signals are a
small subset (the manifold $\mathcal{S}$) of all distinguishable
waveforms with the above characteristics. However, without
prior information about which waveforms are relevant,
we cannot a priori ignore any waveform, and so we must
include in our counting even the irrelevant ones. Note
also that the quantity $I_{\rm total}$ quantifies the amount of
information we gain from the measurement about the shape
of the merger waveform; however, in the absence of any
templates we do not learn anything about the source of the
waves. In Appendix B we derive and in Sec. IX A we
discuss an approximate formula for $I_{\rm total}$ in terms of the
matched filtering signal-to-noise ratio $\rho$ and the number
of frequency bins $\mathcal{N}_{\rm bins}$ [Eq. (9.8)]. This approximate
formula can be understood with a simple, intuitive argument,
which we also elucidate in Sec. IX A.



Consider now the situation in which templates are
available. In Sec. IX B below [Eq. (9.11)] we define a
quantity $I_{\rm source}$, which is, roughly speaking, the base 2
logarithm of the number of distinguishable waveforms
that could have come from BBH mergers and that are
distinguishable in the detector noise. The quantity $I_{\rm source}$
differs from the quantity $I_{\rm total}$ in that it counts only the
subset of waveforms relevant to BBH mergers. Note that
the information which $I_{\rm source}$ quantifies is information
about the source of the waves: when templates are available
we can relate the waveform shape to properties of
the BBH system. In Appendix B we derive an approximate
formula [Eq. (9.12)] for $I_{\rm source}$.

Finally, in Sec. IX C we estimate how much of the
information $I_{\rm source}$ is lost due to template numerical error
[Eq. (9.20)] and due to having insufficiently many templates
in one's grid [Eq. (9.24)], and deduce requirements
one's grid of templates must satisfy in order for the loss
of information to be unimportant.



A. Total information gain 

A precise definition of the total information gain $I_{\rm total}$
is the following: Let $T$ and $\Delta f$ be a priori upper bounds
for the durations and bandwidths of merger signals, and
let $V$ be the vector space of signals with duration $\le T$
inside the relevant frequency band. This vector space $V$
has dimension $\mathcal{N}_{\rm bins} = 2T\Delta f$. Let $s$, $h$ and $n$ denote the
detector output, gravitational wave signal and detector
noise respectively, so that $s = h + n$. The quantities $s$,
$h$, and $n$ are all elements of $V$. Let $p^{(0)}(h)$ be the PDF
describing our prior information about the gravitational
wave signal [34], and let $p(h\,|\,s)$ denote the posterior PDF
for $h$ after the measurement, i.e., the PDF for $h$ given
that the detector output is $s$. A standard Bayesian analysis
shows that $p(h\,|\,s)$ will be given by

$$ p(h\,|\,s) = \mathcal{K}\, p^{(0)}(h)\, \exp\left[ -(s - h\,|\,s - h)/2 \right], \tag{9.1} $$

where $\mathcal{K}$ is a normalization constant [31]. Finally, let
$p[h\,|\,\rho(s)]$ be the PDF of $h$ given that the magnitude $\|s\|$
of the measured signal is $\rho(s)$. We define the quantity
$I_{\rm total}$ to be

$$ I_{\rm total} \equiv \int dh \; p(h\,|\,s) \, \log_2 \left[ \frac{p(h\,|\,s)}{p[h\,|\,\rho(s)]} \right]. \tag{9.2} $$

By this definition, $I_{\rm total}$ is the relative information of
the probability distributions $p[h\,|\,\rho(s)]$ and $p(h\,|\,s)$ [65].
In Appendix B we show that the quantity (9.2) in fact
represents the base 2 logarithm of the number of distinguishable
wave shapes that could have been measured
and that are compatible with one's measurement of the
magnitude $\rho(s)$ of the data stream. Thus, one learns
$I_{\rm total}$ bits of information about the waveform $h$ when one
goes from knowing only the magnitude $\|s\|$ of the detector
output to knowing the actual detector output $s$.

We also show in Appendix B that in the limit of no
prior information other than $T$ and $\Delta f$, an approximate
formula for the quantity (9.2) is

$$ I_{\rm total} = \frac{1}{2} \mathcal{N}_{\rm bins} \log_2 \left[ \rho(s)^2 / \mathcal{N}_{\rm bins} \right] + O\left[ \ln \mathcal{N}_{\rm bins} \right]. \tag{9.3} $$

The formula (9.3) is valid in the limit of large $\mathcal{N}_{\rm bins}$ for
fixed $\rho(s)^2/\mathcal{N}_{\rm bins}$, and moreover applies only when

$$ \rho(s)^2 / \mathcal{N}_{\rm bins} > 1; \tag{9.4} $$

see below for further discussion of this point.



There is a simple and intuitive way to understand the
result (9.3). Let us fix the gravitational waveform, $h$,
considered as a point in the $\mathcal{N}_{\rm bins}$-dimensional Euclidean
space $V$. What is measured is the detector output $h + n$,
whose location in $V$ is displaced from that of $h$. The direction
and magnitude of the displacement depend upon
the particular instance of the noise $n$. However, if we
average over an ensemble of realizations of the noise, we
can see that the displacement due to the noise is in a random
direction and has rms magnitude $\sqrt{\mathcal{N}_{\rm bins}}$ (since on
an appropriate basis each component of $n$ has rms value
1). Therefore, all points $h'$ lying inside a hypersphere of
radius $\sqrt{\mathcal{N}_{\rm bins}}$ centered on $h$ are effectively indistinguishable
from each other. The volume of such a hypersphere is

$$ C_{\mathcal{N}_{\rm bins}} \left( \sqrt{\mathcal{N}_{\rm bins}} \right)^{\mathcal{N}_{\rm bins}}, \tag{9.5} $$

where $C_{\mathcal{N}_{\rm bins}}$ is a constant whose value is unimportant.
When we measure a detector output $s$ with magnitude
$\rho(s)$, the set of signals $h$ that could have given rise to an
identical measured $\rho(s)$ will form a hypersphere of radius
$\sim \rho(s)$ and volume

$$ C_{\mathcal{N}_{\rm bins}} \, \rho(s)^{\mathcal{N}_{\rm bins}}. \tag{9.6} $$

The number of distinguishable signals in this large hypersphere
will be approximately the ratio of the two volumes
(9.5) and (9.6); the base 2 logarithm of this ratio is the
quantity (9.3).
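
The volume-ratio argument can be reproduced in a few lines. The sketch below is illustrative only: it uses the standard formula for the volume of a ball in `dim` dimensions for the constant $C_{\mathcal{N}_{\rm bins}}$ (which cancels in the ratio), and the function names are hypothetical.

```python
import math

def log2_ball_volume(radius, dim):
    # log2 of the volume of a dim-dimensional ball, C_dim * radius**dim,
    # with C_dim = pi**(dim/2) / Gamma(dim/2 + 1).
    log_c = 0.5 * dim * math.log(math.pi) - math.lgamma(0.5 * dim + 1.0)
    return (log_c + dim * math.log(radius)) / math.log(2.0)

def bits_from_volume_ratio(rho_s, n_bins):
    # log2 of [volume of radius rho(s)] / [volume of radius sqrt(n_bins)],
    # i.e. the ratio of Eq. (9.6) to Eq. (9.5).
    return log2_ball_volume(rho_s, n_bins) - log2_ball_volume(math.sqrt(n_bins), n_bins)

def bits_leading_order(rho_s, n_bins):
    # Leading term of Eq. (9.3).
    return 0.5 * n_bins * math.log2(rho_s ** 2 / n_bins)
```

Because the constant cancels, the two functions agree for any $\rho(s)$ and $\mathcal{N}_{\rm bins}$; the $O[\ln \mathcal{N}_{\rm bins}]$ correction in Eq. (9.3) comes from the more careful treatment in Appendix B, not from this counting argument.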



Equation (9.3) expresses the information gain as a
function of the magnitude of the measured detector output
$s$. We now re-express this information gain in terms
of properties of the gravitational wave signal $h$. For a
given $h$, Eqs. (2.16) and (2.17) show that the detector
output's magnitude $\rho(s)$ will be approximately

$$ \rho(s)^2 \approx \rho^2 + \mathcal{N}_{\rm bins} \pm \sqrt{\mathcal{N}_{\rm bins}}. \tag{9.7} $$



Here $\rho^2 = \|h\|^2$ is the SNR squared (2.5) that would be
achieved if matched filtering were possible (if templates
were available). We use $\rho$ simply as a convenient measure
of signal strength; in this context, it is meaningful even
in situations where templates are unavailable and where
matched filtering cannot be carried out. The last term
in Eq. (9.7) gives the approximate size of the statistical
fluctuations in $\rho(s)^2$. We now substitute Eq. (9.7) into
Eq. (9.3) and obtain

$$ I_{\rm total} = \frac{1}{2} \mathcal{N}_{\rm bins} \log_2 \left[ 1 + \rho^2 / \mathcal{N}_{\rm bins} \right] \left[ 1 + O\!\left( \frac{1}{\sqrt{\mathcal{N}_{\rm bins}}} \right) \right]. \tag{9.8} $$



Also, the condition (9.4) for the applicability of Eq. (9.3),
when expressed in terms of $\rho$ instead of $\rho(s)$, becomes

$$ \frac{\rho^2}{\mathcal{N}_{\rm bins}} \pm \frac{1}{\sqrt{\mathcal{N}_{\rm bins}}} > 0, \tag{9.9} $$

which will be satisfied with high probability when $\rho \gg
\mathcal{N}_{\rm bins}^{1/4}$ [64]. In the regime $\rho \lesssim \mathcal{N}_{\rm bins}^{1/4}$, the condition (9.9)
is typically not satisfied and the formula (9.3) does not
apply; we show in Appendix B that in this case the
information gain (9.2) is usually very small, depending
somewhat on the prior PDF $p^{(0)}(h)$. [In contexts other
than BBH merger waveforms, the information gain can
be large in the regime $\rho \ll \mathcal{N}_{\rm bins}^{1/4}$ if the prior PDF $p^{(0)}(h)$
is very sharply peaked. For example, when one considers
measurements of binary neutron star inspirals with advanced
LIGO interferometers, the information gain in the
measurement is large even though typically one will have
$\rho \ll \mathcal{N}_{\rm bins}^{1/4}$, because we have very good prior information
about inspiral waveforms.]

As an example, a typical detected BBH event might
have an SNR for the merger signal of $\rho \sim 10$, and the
number of frequency bins $\mathcal{N}_{\rm bins}$ might be $\sim 30$ [11]. Then,
Eq. (9.8) tells us that $\sim 3 \times 10^9 \approx 2^{32}$ signals of the same
magnitude could have been distinguished, thus the number
of bits of information gained is $\sim 32$. More generally,
for ground based interferometers we expect $\rho$ to lie in
the range $5 \lesssim \rho \lesssim 100$ [11], and therefore we expect
$10 \lesssim I_{\rm total} \lesssim 120$; and for LISA we expect $\rho$ to typically
lie in the range $10^3 \lesssim \rho \lesssim 10^5$ so that $200 \lesssim I_{\rm total} \lesssim 400$.
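
The numbers in this example follow directly from the leading term of Eq. (9.8). The sketch below evaluates it and also checks the mean of Eq. (9.7) by Monte Carlo; it assumes unit-variance white noise in each bin, and the function and variable names are hypothetical.

```python
import math
import numpy as np

def i_total(rho, n_bins):
    # Leading-order Eq. (9.8): 0.5 * N_bins * log2(1 + rho^2 / N_bins).
    return 0.5 * n_bins * math.log2(1.0 + rho ** 2 / n_bins)

bits_example = i_total(10.0, 30)          # the rho ~ 10, N_bins ~ 30 example

# Monte Carlo check of the mean of Eq. (9.7): |s|^2 for s = h + n,
# with unit-variance noise in every bin.
rng = np.random.default_rng(2)
n_bins, rho = 30, 10.0
h = np.zeros(n_bins)
h[0] = rho                                 # any h with |h| = rho will do
noise = rng.normal(size=(20000, n_bins))
s_norm2 = np.sum((h + noise) ** 2, axis=1)
mean_s_norm2 = float(s_norm2.mean())       # expected mean: rho^2 + N_bins
```

Here `bits_example` comes out close to 32 bits, and `mean_s_norm2` is close to $\rho^2 + \mathcal{N}_{\rm bins} = 130$.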



B. Amount of information gained about the wave's 
source 

Consider now the idealized situation in which a complete
family of accurate theoretical template waveforms
$h(\theta)$ are available for the merger. Without templates, we
gain $I_{\rm total}$ bits of information about the shape of the gravitational
waveform in a measurement. With templates,
some, but not all, of this information can be translated
into information about the BBH source. For instance,
suppose in the example considered above that the number
of distinguishable waveforms that could have come
from BBH mergers and that are distinguishable in the
detector noise is $2^{25}$. (This number must be less than the
total number $\sim 2^{32}$ of distinguishable waveform shapes,
since waveforms from BBH mergers will clearly not fill
out the entire function space $V$ of possible gravitational
waveforms.) In this example, by identifying which template
best fits the detector output, we can gain $\sim 25$
bits of information about the BBH source (e.g. about
the black holes' masses or spins). We will call this number
of bits of information $I_{\rm source}$; clearly $I_{\rm source} \le I_{\rm total}$
always.

What of the remaining $I_{\rm total} - I_{\rm source}$ bits of information
(7 bits in the above example)? If the detector output
is close to one of the template shapes, then this closeness
can be regarded as evidence in favor of the theory of gravity
(general relativity) used to compute the templates, so
the $I_{\rm total} - I_{\rm source}$ extra bits of information can be viewed
as information about the validity of general relativity. If
one computed templates in more general theories of gravity,
one could in principle translate these $I_{\rm total} - I_{\rm source}$
bits of information into a quantitative form and obtain
constraints on the parameters entering into the gravitational
theory. However, with only general-relativistic
templates at one's disposal, the information contained in
the $I_{\rm total} - I_{\rm source}$ bits will simply result in a qualitative
confirmation of general relativity, in the sense that one
of the general relativistic templates will provide a good
fit to the data.

It is possible to give a precise definition of the number
of bits of information gained about the BBH source,
$I_{\rm source}$, in the following way. Let $p(\theta\,|\,s)$ denote the probability
distribution for the source parameters $\theta$ given the
measurement $s$. This PDF is given by a formula analogous
to Eq. (9.1):

$$ p(\theta\,|\,s) = \mathcal{K}\, p^{(0)}(\theta)\, \exp\left[ -(s - h(\theta)\,|\,s - h(\theta))/2 \right], \tag{9.10} $$

where $p^{(0)}(\theta)$ is the prior PDF for $\theta$ and $\mathcal{K}$ is a normalization
constant. Let $p[\theta\,|\,\rho(s)]$ be the posterior PDF for
$\theta$ given that the magnitude $\|s\|$ of the measured signal
is $\rho(s)$. Then we define

$$ I_{\rm source} \equiv \int d\theta \; p(\theta\,|\,s) \, \log_2 \left[ \frac{p(\theta\,|\,s)}{p[\theta\,|\,\rho(s)]} \right]. \tag{9.11} $$



The number of bits of information (9.11) gained about
the BBH source will clearly depend on the details of how
the gravitational waveforms depend on the source parameters,
on the prior expected ranges of these parameters,
etc. In Appendix B we argue that to a rather crude
approximation, $I_{\rm source}$ should be given by the formula (9.8)
with $\mathcal{N}_{\rm bins}$ replaced by the number of parameters $\mathcal{N}_{\rm param}$
on which the waveform has a significant dependence:

$$ I_{\rm source} \approx \frac{1}{2} \mathcal{N}_{\rm param} \log_2 \left[ 1 + \rho^2 / \mathcal{N}_{\rm param} \right]. \tag{9.12} $$

Note that the quantity $\mathcal{N}_{\rm param}$ should be bounded above
by the quantity $n_p$ discussed in Sec. VIII, but may be somewhat
smaller than $n_p$. This will be the case if the waveform
depends only very weakly on some of the parameters
$\theta^\alpha$. Equation (9.12) is only valid when $\mathcal{N}_{\rm param} \le \mathcal{N}_{\rm bins}$.
For BBH mergers we expect $\mathcal{N}_{\rm param} \lesssim 15$, which from
Eq. (9.12) predicts that $I_{\rm source}$ lies in the range $\sim 10$ bits
to $\sim 70$ bits for signal-to-noise ratios $\rho$ in the range 5 to
100 (the expected range for ground based interferometers
[11]), and $\sim 100$ bits to $\sim 200$ bits for $\rho$ in the range $10^3$
to $10^5$ expected for LISA [11].
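
The quoted ranges can be reproduced by evaluating Eq. (9.12) at the ends of each SNR interval. This is a minimal sketch; the function name `i_source` and the default $\mathcal{N}_{\rm param} = 15$ are choices made here for illustration.

```python
import math

def i_source(rho, n_param=15):
    # Eq. (9.12): 0.5 * N_param * log2(1 + rho^2 / N_param).
    return 0.5 * n_param * math.log2(1.0 + rho ** 2 / n_param)

ground = [i_source(5.0), i_source(100.0)]   # ground-based SNR range
lisa = [i_source(1e3), i_source(1e5)]       # LISA SNR range
```

The ground-based values come out near 10 and 70 bits, and the LISA values near 120 and 220 bits, consistent with the rough ranges quoted above.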



C. Loss of information about source due to template 
inaccuracies or to sparseness of the lattice of 
templates 



As we discussed in Sec. VIII, numerical templates will
contain some unavoidable error due to the calculational
technique. In this section we analyze how that error affects
the information gained in the measurement process,
and use this analysis to infer the maximum allowable
template error.

Let us write

$$ h_T(\theta) = h(\theta) + \delta h(\theta), \tag{9.13} $$

where $h(\theta)$ denotes the true waveform shape, $h_T(\theta)$ the
numerical template, and $\delta h(\theta)$ the numerical error. It is
clear that the numerical error will reduce the amount of
information (9.11) one can obtain about the source. We
can make a crude estimate of the amount of reduction
in the following way. We model the numerical error as a
random process with

$$ \langle \delta h_i \, \delta h_j \rangle = C_{ij}, \tag{9.14} $$

where for simplicity we take $C_{ij} = \lambda \Gamma_{ij}$ for some constant
$\lambda$. Here $\Gamma_{ij}$ is the matrix introduced in Eq. (2.11). The
expected value of $(\delta h\,|\,\delta h)$ is then given by

$$ \langle (\delta h\,|\,\delta h) \rangle = \Sigma^{ij} \langle \delta h_i \, \delta h_j \rangle = \lambda \, \Sigma^{ij} \Gamma_{ij} = \lambda \, \mathcal{N}_{\rm bins}, \tag{9.15} $$

where we have used Eq. (9.14). We can write $\lambda$ in terms
of the quantity $\Delta$ discussed in Sec. VIII by combining
Eqs. (8.1) and (9.15), yielding

$$ \lambda = 2 \Delta \, \frac{\rho^2}{\mathcal{N}_{\rm bins}}. \tag{9.16} $$



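The information $I'_{\rm source}$ which one obtains when measuring
with inaccurate templates can be calculated by
treating the sum of the detector noise $n$ and the template
numerical error $\delta h$ as an effective noise $n^{\rm (eff)}$. This
effective noise is characterized by the covariance matrix

$$ \langle n_i^{\rm (eff)} \, n_j^{\rm (eff)} \rangle = \Gamma_{ij} + \lambda \Gamma_{ij}. \tag{9.17} $$

Thus, in this simplified model, the effect of the numerical
error is to increase the noise power by a factor $1 + \lambda$. The new
information gain $I'_{\rm source}$ is therefore given by Eq. (9.12)
with $\rho$ replaced by an effective SNR $\rho'$, where

$$ (\rho')^2 = \frac{\rho^2}{1 + \lambda}. \tag{9.18} $$

If we now combine Eqs. (9.12), (9.16) and (9.18), we find
that the loss in information due to template inaccuracy

$$ \delta I_{\rm source} = I_{\rm source} - I'_{\rm source} \tag{9.19} $$

is given by

$$ \delta I_{\rm source} = \rho^2 \Delta \left( \frac{\rho^2}{\rho^2 + \mathcal{N}_{\rm param}} \right) \left( \frac{\mathcal{N}_{\rm param}}{\mathcal{N}_{\rm bins}} \right) + O(\Delta^2). \tag{9.20} $$

To ensure that $\delta I_{\rm source} \lesssim 1$ bit, we therefore must have

$$ \Delta \lesssim \frac{1}{\rho^2} \left( \frac{\rho^2 + \mathcal{N}_{\rm param}}{\rho^2} \right) \left( \frac{\mathcal{N}_{\rm bins}}{\mathcal{N}_{\rm param}} \right). \tag{9.21} $$

This condition is a more accurate version of the condition
(8.13) that was derived in Sec. VIII. It approximately
reduces to the condition (8.13) for typical BBH
events (except in the unrealistic limit $\rho^2 \ll \mathcal{N}_{\rm param}$), since
$\mathcal{N}_{\rm param} \sim 10$ and $10 \lesssim \mathcal{N}_{\rm bins} \lesssim 100$ [11].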
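
The chain of Eqs. (9.16) to (9.21) is easy to evaluate directly. The sketch below is an illustration with hypothetical function names: it computes the information loss exactly from Eqs. (9.12), (9.16) and (9.18), and compares it with the leading-order expression (9.20). A direct expansion of the base 2 logarithm gives a first-order coefficient larger than that of Eq. (9.20) by the factor $1/\ln 2 \approx 1.44$, which is immaterial at the order-of-magnitude level of the argument.

```python
import math

def i_source(rho2, n_param):
    # Eq. (9.12), written in terms of rho^2.
    return 0.5 * n_param * math.log2(1.0 + rho2 / n_param)

def info_loss_exact(rho, delta, n_param, n_bins):
    lam = 2.0 * delta * rho ** 2 / n_bins          # Eq. (9.16)
    rho2_eff = rho ** 2 / (1.0 + lam)              # Eq. (9.18)
    # Eq. (9.19): difference of the two information gains.
    return i_source(rho ** 2, n_param) - i_source(rho2_eff, n_param)

def info_loss_leading(rho, delta, n_param, n_bins):
    # Eq. (9.20) without the O(Delta^2) remainder.
    return rho ** 2 * delta * (rho ** 2 / (rho ** 2 + n_param)) * (n_param / n_bins)

def delta_max(rho, n_param, n_bins):
    # Eq. (9.21): Delta for which the leading-order loss is about one bit.
    return (1.0 / rho ** 2) * ((rho ** 2 + n_param) / rho ** 2) * (n_bins / n_param)
```

For example, `delta_max(7.0, 10, 30)` is a few times $10^{-2}$, the same order of magnitude as the criterion (8.2).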



Turn next to the issue of the required degree of fineness
of a template lattice; i.e., the issue of how close in parameter
space successive templates must be to one another.
This is mostly relevant to the third scenario described in
Sec. I C, in which numerical relativists are able to simulate
essentially arbitrary BBH mergers, and to carry out
a large number of such simulations. We can parameterize
the degree of fineness by a dimensionless parameter
$\epsilon_{\rm grid}$ in the following way: the lattice is required to have
the property that for any possible true signal $h(\theta)$, there
exists some template $h(\theta^*)$ in the lattice with

$$ \frac{(h(\theta)\,|\,h(\theta^*))}{\sqrt{(h(\theta)\,|\,h(\theta))}\,\sqrt{(h(\theta^*)\,|\,h(\theta^*))}} \ge 1 - \epsilon_{\rm grid}. \tag{9.22} $$

The quantity $1 - \epsilon_{\rm grid}$ is called the minimal match [33].
Suppose that one defines a metric on the space $V$ of
templates using the norm (2.4). It then follows from
Eq. (9.22) that the largest possible distance $D_{\rm max}$ between
an incoming signal $h(\theta)$ and some rescaled template
$A\,h(\theta^*)$ with $A > 0$ is

$$ D_{\rm max} = \sqrt{2 \epsilon_{\rm grid}} \; \rho, \tag{9.23} $$

where $\rho$ is the matched filtering SNR (2.5) of the incoming
signal.



where p is the matched filtering SNR (2.5) of the incom- 
ing signal. 

We can view the discreteness in the template lattice as
roughly equivalent to an ignorance on our part about the
location of the manifold $\mathcal{S}$ of true gravitational wave signals
between the lattice points. The maximum distance
any correct waveform $h(\theta)$ could be away from where
we may think it should be (where our guess is for example
obtained by linearly extrapolating from the nearest
points on the lattice) is of order $D_{\rm max}$. We can crudely
view this ignorance as equivalent to a numerical error $\delta h$
in the templates of magnitude $\|\delta h\| = \sqrt{2 \epsilon_{\rm grid}}\,\rho$. Combining
Eqs. (8.1) and (9.20) shows that the loss of information
$\delta I_{\rm source}$ due to the discreteness of the grid should
therefore be of order

$$ \delta I_{\rm source} \sim \rho^2 \left( \frac{\rho^2}{\rho^2 + \mathcal{N}_{\rm param}} \right) \left( \frac{\mathcal{N}_{\rm param}}{\mathcal{N}_{\rm bins}} \right) \epsilon_{\rm grid}. \tag{9.24} $$

The grid fineness $\epsilon_{\rm grid}$ should be chosen to ensure that
$\delta I_{\rm source}$ is small compared to unity, while also taking into
account that the fractional loss in event detection rate
for signal searches due to the coarseness of the grid will
be $\lesssim 3 \epsilon_{\rm grid}$; see Sec. VIII B above and Refs. [33,62].
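
The relation between the minimal match and $D_{\rm max}$ can be checked in a two-dimensional toy example. The sketch below is illustrative only: it assumes a Euclidean inner product (white noise) and a positive overlap between signal and template, and all function names are hypothetical.

```python
import math
import numpy as np

def d_max(eps_grid, rho):
    # Eq. (9.23): largest distance to the best rescaled template.
    return math.sqrt(2.0 * eps_grid) * rho

def grid_info_loss(eps_grid, rho, n_param, n_bins):
    # Eq. (9.24): order-of-magnitude information loss from grid coarseness.
    return rho ** 2 * (rho ** 2 / (rho ** 2 + n_param)) * (n_param / n_bins) * eps_grid

def min_distance_to_rescaled(h, template):
    # Distance from h to the closest point A * template on the ray through
    # the template (Euclidean norm; positive overlap assumed so that A > 0).
    a = np.dot(h, template) / np.dot(template, template)
    return float(np.linalg.norm(h - a * template))

# Toy check: a unit template whose match with h is exactly 1 - eps.
eps, rho = 0.01, 10.0
h = np.array([rho, 0.0])
match = 1.0 - eps
template = np.array([match, math.sqrt(1.0 - match ** 2)])
distance = min_distance_to_rescaled(h, template)
```

The exact distance is $\rho\sqrt{2\epsilon_{\rm grid} - \epsilon_{\rm grid}^2}$, which is slightly below the value $\sqrt{2\epsilon_{\rm grid}}\,\rho$ of Eq. (9.23) and agrees with it to first order in $\epsilon_{\rm grid}$.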

X. CONCLUSIONS

Theoretical template waveforms for the merger phase 
of BBH coalescences from numerical relativity will be a 
great aid to the analysis of detected BBH coalescence 
events. A complete bank of templates could be used to 
implement a matched filtering analysis of merger data, 
which would allow measurements of the binary's parameters
and tests of general relativity in a strong field, highly
dynamic, highly non-spherical regime. Such matched filtering
may also be possible without a complete bank of
templates, if iterative supercomputer simulations are car- 
ried out in tandem with the data analysis. A match of 
the detected waves with those produced by numerical 
relativity will be a triumph for the theory of general rel- 
ativity and an unambiguous signature of the existence of 
black holes. Qualitative information from representative 
supercomputer simulations will also be useful, both as 
an input to algorithms for extracting the merger wave- 
form's shape from the noisy interferometer data stream, 
and as an aid to interpreting the observed waveforms and 
making deductions about the waves' source. 

We have derived, using several rather different concep- 
tual starting points, accuracy requirements that numeri- 
cal templates must satisfy in order for them to be useful 
as data analysis tools. We first considered matched fil- 
tering signal searches using templates; here the loss in 
event rate due to template inaccuracies is simply related 
to the degradation in SNR, and leads to a criterion on 
template accuracy. Approximately the same criterion is 
obtained when one demands that the systematic errors in 
parameter extraction be small compared to the detector- 
noise induced statistical errors. Finally, we quantified 
the information that is encoded in the merger waveforms 
using the mathematical framework of information the- 
ory, and deduced how much of the information is lost 
due to template inaccuracies or to having insufficiently 
many templates. We deduced approximate requirements 
that templates must satisfy (in terms of both accuracy 
of individual templates and of the spacing between templates)
in order that all of the waveform's information can
be extracted.

The theory of maximum likelihood estimation is a use- 
ful starting point for deriving algorithms for reconstruct- 
ing the gravitational waveforms from the noisy interfer- 
ometer output. In this paper we have discussed and de- 
rived such algorithms in the contexts of both a single 
detector and a network of several detectors; these algorithms
can be tailored to build in many different kinds
of prior information about the waveforms.

APPENDIX A: WAVEFORM
RECONSTRUCTION USING A NETWORK OF
DETECTORS

In this appendix we describe how to extend the filtering
methods discussed in Sec. V above from a single
detector to a network of an arbitrary number of detectors.
The underlying principle is again simply to use the
maximum likelihood estimator of the waveform shape.
We also explain the relationship between our waveform
reconstruction method and the method of Gürsel and
Tinto [22]. Secs. A 1 and A 2 below overlap somewhat
with unpublished analyses by Sam Finn [23].



We start by establishing some notations for a network
of detectors; these notations and conventions follow those
of Appendix A of Ref. [27]. The output of such a network
can be represented as a vector $\mathbf{s}(t) = [s_1(t), \ldots, s_{n_d}(t)]$,
where $n_d$ is the number of detectors, and $s_a(t)$ is the
strain amplitude read out from the $a$th detector [66].
There will be two contributions to the detector output
$\mathbf{s}(t)$: the intrinsic detector network noise $\mathbf{n}(t)$ (a vector
random process), and the true gravitational wave signal
$\mathbf{h}(t)$:

$$ \mathbf{s}(t) = \mathbf{h}(t) + \mathbf{n}(t). \tag{A1} $$



We will assume that the detector network noise is sta- 
tionary and Gaussian. In reality the noise will be non- 
stationary and non-Gaussian, but understanding the op- 
timal method of waveform reconstruction under our ide- 
alized assumptions is an important first step towards 
more sophisticated waveform reconstruction algorithms 
that incorporate more information about the nature of 
the noise. With this assumption, the statistical proper- 
ties of the detector network noise can be described by the 
auto-correlation matrix 

$$ C_n(\tau)_{ab} = \langle n_a(t + \tau)\, n_b(t) \rangle - \langle n_a(t + \tau) \rangle \, \langle n_b(t) \rangle, \tag{A2} $$

where the angular brackets mean an ensemble average or
a time average. The Fourier transform of the correlation
matrix, multiplied by two, is the power spectral density
matrix:

$$ S_h(f)_{ab} = 2 \int_{-\infty}^{\infty} d\tau \; e^{2\pi i f \tau} \, C_n(\tau)_{ab}. \tag{A3} $$

The off-diagonal elements of this matrix describe the effects
of correlations between the noise sources in the various
detectors, while each diagonal element $S_h(f)_{aa}$ is
just the usual power spectral density of the noise in the
$a$th detector. We assume that the functions $S_h(f)_{ab}$ for
$a \ne b$ have been measured for each pair of detectors.

The Gaussian random process $\mathbf{n}(t)$ determines a natural
inner product $(\ldots | \ldots)$ on the space of functions $\mathbf{h}(t)$,
which generalizes the inner product (2.3) discussed in the
body of the paper in the context of a single detector. The
inner product is defined so that the probability that the
noise takes a specific value $\mathbf{n}_0(t)$ is

$$ p[\mathbf{n} = \mathbf{n}_0] \propto e^{-(\mathbf{n}_0 | \mathbf{n}_0)/2}, \tag{A4} $$

and it is given by

$$ (\mathbf{g}\,|\,\mathbf{k}) \equiv 4\,{\rm Re} \int_0^\infty df \; \tilde g_a(f)^* \left[ \mathbf{S}_h(f)^{-1} \right]^{ab} \tilde k_b(f). \tag{A5} $$

See, e.g., Appendix A of Ref. [27] for more details.
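
A discretized version of the inner product (A5) can be written compactly. The following is a sketch under stated assumptions: the frequency grid is uniform, the Fourier data are given only at positive frequencies, and the function name and array layout are choices made here rather than anything specified in the paper.

```python
import numpy as np

def network_inner_product(g, k, psd, df):
    """Discretized Eq. (A5).

    g, k : complex arrays of shape (n_freq, n_det), one-sided Fourier data.
    psd  : array of shape (n_freq, n_det, n_det), the matrix S_h(f)_{ab}.
    df   : frequency resolution (assumed uniform).
    """
    psd_inv = np.linalg.inv(psd)                  # invert S_h(f) bin by bin
    integrand = np.einsum("fa,fab,fb->f", np.conj(g), psd_inv, k)
    return 4.0 * np.real(np.sum(integrand)) * df
```

When the noise in different detectors is uncorrelated, `psd` is diagonal in the detector indices and the result reduces to the sum of the single-detector inner products.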

Turn, now, to the relation between the gravitational
wave signal $h_a(t)$ seen in the $a$th detector, and the two
independent polarization components $h_+(t)$ and $h_\times(t)$ of
the gravitational waves. Let $\mathbf{x}_a$ be the position and $\mathbf{d}_a$
the polarization tensor of the $a$th detector in the detector
network. By polarization tensor we mean that tensor $\mathbf{d}_a$
for which the detector's output $h_a(t)$ is given in terms of
the waves' transverse traceless strain tensor $\mathbf{h}(\mathbf{x},t)$ by

$$ h_a(t) = \mathbf{d}_a : \mathbf{h}(\mathbf{x}_a, t), \tag{A6} $$

where the colon denotes a double contraction. A gravitational
wave burst coming from the direction of a unit
vector $\mathbf{m}$ will have the form

$$ \mathbf{h}(\mathbf{x},t) = \sum_{A=+,\times} h_A(t + \mathbf{m}\cdot\mathbf{x}) \, \mathbf{e}^A_{\mathbf{m}}, \tag{A7} $$

where $\mathbf{e}^+_{\mathbf{m}}$ and $\mathbf{e}^\times_{\mathbf{m}}$ are a basis for the transverse traceless
tensors perpendicular to $\mathbf{m}$, normalized according to
$\mathbf{e}^A_{\mathbf{m}} : \mathbf{e}^B_{\mathbf{m}} = 2\delta^{AB}$. (Note that the notation $\mathbf{n}$ is typically
used to denote direction to the source; we use instead $\mathbf{m}$
because we have denoted by $\mathbf{n}$ the detector noise.) Combining
Eqs. (A6) and (A7) and switching from the time
domain to the frequency domain using our Fourier transform
convention yields

$$ \tilde h_a(f) = F_a^A(\mathbf{m}) \, \tilde h_A(f) \, e^{-2\pi i f \tau_a(\mathbf{m})}, \tag{A8} $$

where the quantities

$$ F_a^A(\mathbf{m}) \equiv \mathbf{e}^A_{\mathbf{m}} : \mathbf{d}_a, \tag{A9} $$

for $A = +, \times$, are detector beam-pattern functions for the
$a$th detector [37] and $\tau_a(\mathbf{m}) = \mathbf{m}\cdot\mathbf{x}_a$ is the time delay
at the $a$th detector relative to the origin of coordinates.
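
The double contraction in Eq. (A9) is straightforward to evaluate numerically. The sketch below is illustrative and rests on two assumptions that are not taken from the text: the detector tensor of an interferometer with orthogonal arms along unit vectors `arm_x` and `arm_y` is taken to be half the difference of the arm outer products, and the polarization angle is fixed by an arbitrary choice of reference vector. The function names are hypothetical.

```python
import numpy as np

def polarization_basis(m):
    # Build e^+ and e^x for unit direction m from an orthonormal pair (p, q)
    # perpendicular to m. The choice of p fixes the polarization angle
    # (an arbitrary convention in this sketch).
    m = np.asarray(m, dtype=float)
    m = m / np.linalg.norm(m)
    helper = np.array([0.0, 0.0, 1.0]) if abs(m[2]) < 0.9 else np.array([1.0, 0.0, 0.0])
    p = np.cross(helper, m)
    p /= np.linalg.norm(p)
    q = np.cross(m, p)
    e_plus = np.outer(p, p) - np.outer(q, q)
    e_cross = np.outer(p, q) + np.outer(q, p)
    return e_plus, e_cross

def beam_patterns(m, arm_x, arm_y):
    # Assumed detector tensor for an interferometer with arms along the unit
    # vectors arm_x and arm_y: d = (arm_x arm_x - arm_y arm_y) / 2.
    d = 0.5 * (np.outer(arm_x, arm_x) - np.outer(arm_y, arm_y))
    e_plus, e_cross = polarization_basis(m)
    # Eq. (A9): F^A = e^A : d (double contraction over both indices).
    return float(np.sum(e_plus * d)), float(np.sum(e_cross * d))
```

The basis tensors built this way satisfy the normalization $\mathbf{e}^A_{\mathbf{m}} : \mathbf{e}^B_{\mathbf{m}} = 2\delta^{AB}$, and for a wave arriving from directly overhead one of the two beam-pattern functions has unit magnitude while the other vanishes.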



1. Derivation of posterior probability distribution 

We now construct the probability distribution
$\mathcal{P}[\mathbf{m}, h_+(t), h_\times(t)\,|\,\mathbf{s}(t)]$ for the gravitational waves to be
coming from direction $\mathbf{m}$ with waveforms $h_+(t)$ and
$h_\times(t)$, given that the output of the detector network is
$\mathbf{s}(t)$. Let $p^{(0)}(\mathbf{m})$ and $p^{(0)}[h_A(t)]$ be the prior probability
distributions for the sky position $\mathbf{m}$ (presumably a
uniform distribution on the unit sphere) and waveform
shapes $h_A(t)$, respectively. A standard Bayesian analysis
along the lines of that given in Ref. and using
Eq. (A4) gives

$$ \mathcal{P}[\mathbf{m}, h_A(t)\,|\,\mathbf{s}(t)] = \mathcal{K}\, p^{(0)}(\mathbf{m})\, p^{(0)}[h_A(t)] \, \exp\left[ -(\mathbf{s} - \mathbf{h}\,|\,\mathbf{s} - \mathbf{h})/2 \right], \tag{A10} $$

where $\mathcal{K}$ is a normalization constant and $\mathbf{h} =
(h_1, \ldots, h_{n_d})$ is understood to be the function of $\mathbf{m}$ and
$h_A(t)$ given by (the Fourier transform of) Eq. (A8).

We simplify the expression (A10) in two stages. First,
we reduce the argument of the exponential from a double
sum over detectors to a single sum over detector sites.
In the next few paragraphs we carry out this reduction,
leading to Eqs. (A18) and (A19) below.

We assume that each pair of detectors in the detector
network comes in one of two categories: (i) pairs of detectors
at the same detector site, which are oriented the
same way, and thus share common detector beam pattern
functions $F_a^A(\mathbf{m})$ (for example the 2 km and 4 km
interferometers at the LIGO Hanford site); or (ii) pairs of
detectors at widely separated sites, for which the detector
noise is effectively uncorrelated. Under this assumption
we can arrange for the matrix $\mathbf{S}_h(f)$ to have a block
diagonal form, with each block corresponding to a detector
site, by choosing a suitable ordering of detectors in the
list $(1, \ldots, n_d)$. Let us denote the detector sites by Greek
indices $\alpha, \beta, \gamma, \ldots$, so that $\alpha$ runs from 1 to $n_s$, where $n_s$
is the number of sites. Let $\mathcal{D}_\alpha$ be the subset of the list of
detectors $(1, \ldots, n_d)$ containing the detectors at the $\alpha$th
site, so that any sum over detectors can be rewritten

$$ \sum_{a=1}^{n_d} = \sum_{\alpha=1}^{n_s} \; \sum_{a \in \mathcal{D}_\alpha}. \tag{A11} $$

Thus, for example, for a 3 detector network with 2 detectors
at the first site and one at the second, $\mathcal{D}_1 = \{1,2\}$
and $\mathcal{D}_2 = \{3\}$. Let $F_\alpha^A(\mathbf{m})$ denote the common value of
the beam pattern functions (A9) for all the detectors at
site $\alpha$. Let $\mathbf{S}_\alpha(f)$ denote the $\alpha$th diagonal sub-block of
the matrix $\mathbf{S}_h(f)$. Then if we define

$$ \Lambda \equiv (\mathbf{s} - \mathbf{h}\,|\,\mathbf{s} - \mathbf{h}) \tag{A12} $$



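[the quantity which appears in the exponential in Eq.
(A10)], we obtain from Eq. (A5)

$$ \Lambda = \sum_{\alpha=1}^{n_s} 4\,{\rm Re} \int_0^\infty df \sum_{a,b \in \mathcal{D}_\alpha} \left[ \tilde s_a(f)^* - \tilde h_a(f)^* \right] \left[ \mathbf{S}_\alpha(f)^{-1} \right]^{ab} \left[ \tilde s_b(f) - \tilde h_b(f) \right]. \tag{A13} $$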
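
The reduction of Eq. (A13) to the per-site form of Eqs. (A14) to (A17) can be verified numerically for a single site and a single frequency bin. The sketch below is illustrative only; it assumes a Hermitian, positive-definite site noise matrix, and the function name `site_reduction` is hypothetical.

```python
import numpy as np

def site_reduction(s, h, cov):
    """Compare the two forms of one site's contribution to Lambda.

    s   : complex outputs of the detectors at the site (one frequency bin)
    h   : complex signal, common to all detectors at the site
    cov : Hermitian noise matrix S_alpha(f) for the site
    """
    m = np.linalg.inv(cov)
    resid = s - h
    direct = np.real(np.conj(resid) @ m @ resid)       # integrand of Eq. (A13)

    s_eff = 1.0 / np.real(np.sum(m))                   # Eq. (A15)
    s_bar = s_eff * np.sum(m @ s)                      # Eq. (A16)
    # Eq. (A17), rewritten using the Hermiticity of the inverse matrix.
    delta = np.real(np.conj(s) @ m @ s) - abs(s_bar) ** 2 / s_eff
    reduced = abs(s_bar - h) ** 2 / s_eff + delta      # integrand of Eq. (A14)
    return direct, reduced
```

The two returned numbers agree to rounding error, and only the first term of `reduced` depends on the signal `h`.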
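Next, we note from Eq. (A8) that the value of $\tilde h_a$ will
be the same for all detectors at a given site $\alpha$. If we
denote this common value by $\tilde h_\alpha$, then we obtain after
some manipulation of Eq. (A13)

$$ \Lambda = \sum_{\alpha=1}^{n_s} 4\,{\rm Re} \int_0^\infty df \left\{ \frac{ \left| \bar s_\alpha(f) - \tilde h_\alpha(f) \right|^2 }{ S_\alpha^{\rm (eff)}(f) } + \Delta_\alpha(f) \right\}. \tag{A14} $$

The meanings of the various symbols in Eq. (A14) are as
follows. The quantity $S_\alpha^{\rm (eff)}(f)$ is defined by

$$ \frac{1}{S_\alpha^{\rm (eff)}(f)} \equiv \sum_{a,b \in \mathcal{D}_\alpha} \left[ \mathbf{S}_\alpha(f)^{-1} \right]^{ab}, \tag{A15} $$

and can be interpreted as the effective overall noise spectrum
for site $\alpha$ [68]. The quantity $\bar s_\alpha$ is given by

$$ \bar s_\alpha(f) \equiv S_\alpha^{\rm (eff)}(f) \sum_{a,b \in \mathcal{D}_\alpha} \left[ \mathbf{S}_\alpha(f)^{-1} \right]^{ab} \tilde s_b(f), \tag{A16} $$

and is, roughly speaking, the mean output strain amplitude
of site $\alpha$. Finally,

$$ \Delta_\alpha(f) \equiv \sum_{a,b \in \mathcal{D}_\alpha} \tilde s_a(f)^* \, \tilde s_b(f) \left\{ \left[ \mathbf{S}_\alpha(f)^{-1} \right]^{ab} - S_\alpha^{\rm (eff)}(f) \sum_{c,d \in \mathcal{D}_\alpha} \left[ \mathbf{S}_\alpha(f)^{-1} \right]^{ac} \left[ \mathbf{S}_\alpha(f)^{-1} \right]^{db} \right\}. \tag{A17} $$

The quantity $\Delta_\alpha$ is independent of $\mathbf{m}$ and $h_A(t)$, and is
therefore irrelevant for our purposes; it can be absorbed
into the normalization constant $\mathcal{K}$ in Eq. (A10). This
unimportance of $\Delta_\alpha$ occurs because we are assuming that
there is some signal present. However, in situations where
one is trying to assess the probability that some signal
(and not just noise) is present in the outputs of the detector
network, the term $\Delta_\alpha$ is very important. In effect,
it encodes the discriminating power against noise bursts
which is due to the presence of detectors with different
noise spectra at one site (e.g., the 2 km and 4 km interferometers
at the LIGO Hanford site). We drop the term
$\Delta_\alpha$ from now on.

The probability distribution for the waveform shapes
and sky direction is now given by, from Eqs. (A10), (A12)
and (A14),

$$ \mathcal{P}[\mathbf{m}, h_A(t)\,|\,\mathbf{s}(t)] = \mathcal{K}\, p^{(0)}(\mathbf{m})\, p^{(0)}[h_A(t)] \, \exp\left[ -\Lambda'/2 \right], \tag{A18} $$

where

$$ \Lambda' = \sum_{\alpha=1}^{n_s} 4\,{\rm Re} \int_0^\infty df \; \frac{ \left| \bar s_\alpha(f) - \tilde h_\alpha(f) \right|^2 }{ S_\alpha^{\rm (eff)}(f) }. \tag{A19} $$

In the second stage, we use Eq. (A8) to express $\Lambda'$ in terms
of the polarization components $\tilde h_A(f)$:

$$ \Lambda' = 4\,{\rm Re} \int_0^\infty df \left\{ \sum_{A,B=+,\times} \Theta^{AB}(f,\mathbf{m}) \left[ \tilde h_A(f) - \hat h_A(f) \right]^* \left[ \tilde h_B(f) - \hat h_B(f) \right] + S(f,\mathbf{m}) \right\}. \tag{A20} $$

Here

$$ \Theta^{AB}(f,\mathbf{m}) \equiv \sum_{\alpha=1}^{n_s} \frac{ F_\alpha^A(\mathbf{m}) \, F_\alpha^B(\mathbf{m}) }{ S_\alpha^{\rm (eff)}(f) }, \tag{A21} $$

$$ \hat h_A(f) \equiv \Theta_{AB}(f,\mathbf{m}) \sum_{\alpha=1}^{n_s} \frac{ F_\alpha^B(\mathbf{m}) \, \bar s_\alpha(f) \, e^{2\pi i f \tau_\alpha(\mathbf{m})} }{ S_\alpha^{\rm (eff)}(f) }, \tag{A22} $$

where $\Theta_{AB}$ is the inverse matrix to $\Theta^{AB}$, and

$$ S(f,\mathbf{m}) \equiv \sum_{\alpha=1}^{n_s} \frac{ \left| \bar s_\alpha(f) \right|^2 }{ S_\alpha^{\rm (eff)}(f) } - \Theta^{AB} \, \hat h_A(f)^* \, \hat h_B(f). \tag{A23} $$



2. Estimating the waveform shapes and the
direction to the source
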
Equations ( A18 ) and ( A20 ) constitute one of the main 
results of this Appendix, and give the final and general 
robability distribution for m and fiA(t). In the next few 
paragraphs we discuss its implications. As mentioned at 
the start of the appendix, we are primarily interested in 
situations where the direction m to the source is already 
known. However, as an aside, we now briefly consider the 
more general context where the direction to the source 
as well as the wavefor m sh apes are unknown 
Starting from Eq 



(|Al< 



one could use either maxi- 
mum likelihood estimators or so-called Bayes estimators 
p7| , |69| -[7]ll to determine "best-guess" values of m and 
h,A{t). Bayes estimators have significant advantages over 
maximum likelihood estimators but are typically much 
more difficult to compute, as explained in, for example, 
Appendix A of Ref. [^7|. The Bayes estimator for the 
direc tion to the source will be given by first integrating 
Eq. (A18) over all waveform shapes, which yields 



V[m\s(t)} = /Cp (0) (m)X>(m) exp 



-2 



V[m, h A (t)\s(t)] = JCp^(m)p (0, [hA(t) 



-A'/2 



where 



A' = E 4Re 



a — 1 



df 



M/)-M/)l s 
^ off) (/) 



(Ai8) 



(A19) 



dfS(f,m) , 
(A24) 



Finally, we express this probability distribution directly 
in terms of t he wavefo rms h + (t) and h x (t) by substitut- 
ing Eq. (|A8|) into Eq. (A19), which gives 



where 2?(m) is a determinant-type factor that is pro- 
duced by integrating over the waveforms h,A(t). This 
factor encodes the information that the detector network 
has greater sensitivity in some directions than in oth- 
ers, and that other things being equal, a signal is more 
likely to have come from a direction in which the net- 
work is more sensitive. The Bayes estimator of m is 
now obtained simply by calculating the expected value 
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of m with respect to the probability distribution (A24). 
The simpler, maximum likelihood estimator of m is given 
by choosing the values of m [and of Ha (t)] which max- 
imize the probability distribution ( A18 ), or equivalently 
by minimizing the quantity 



dfS(f,m). 



(A25) 



Let us de note this value of m by itiml(s)- Note that the 
quantity (A25) encodes all the information about time 
delays between the signals detected at the various de- 
tector sites; as is well known, directional information is 
obtained primarily through time delay information [^9| . 
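The site-level reduction of Eqs. (A13)-(A17) is easy to check numerically at a single frequency. The following Python sketch is an illustration only: the 2x2 noise matrix, the detector outputs and the trial signal are made-up numbers, and the helper names `invert_2x2`, `site_quantities` and `lambda_direct` are hypothetical. It assumes a site with two detectors and a Hermitian noise matrix, so that the effective spectrum is real.

```python
# Sketch: single-site check of Eqs. (A13)-(A17) at one frequency.
# All numerical inputs are illustrative, not taken from any real detector.

def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists (entries may be complex)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def site_quantities(S, s):
    """Return (S_eff, s_site, Delta) for a two-detector site, Eqs. (A15)-(A17)."""
    M = invert_2x2(S)
    n = len(s)
    inv_S_eff = sum(M[a][b] for a in range(n) for b in range(n))      # Eq. (A15)
    S_eff = 1.0 / inv_S_eff.real
    s_site = S_eff * sum(M[a][b] * s[b]
                         for a in range(n) for b in range(n))         # Eq. (A16)
    quad = sum(M[a][b] * s[a].conjugate() * s[b]
               for a in range(n) for b in range(n))
    Delta = quad.real - abs(s_site) ** 2 / S_eff                      # Eq. (A17)
    return S_eff, s_site, Delta

def lambda_direct(S, s, h):
    """Integrand of Eq. (A13) for one site with a common signal value h."""
    M = invert_2x2(S)
    n = len(s)
    return sum(M[a][b] * (s[a] - h).conjugate() * (s[b] - h)
               for a in range(n) for b in range(n)).real
```

For any trial value `h`, `lambda_direct(S, s, h)` should agree with `abs(s_site - h)**2 / S_eff + Delta`, and `Delta` does not depend on `h`, which is the content of Eq. (A14). For uncorrelated detectors the effective spectrum reduces to the familiar rule $1/S^{(\rm eff)} = 1/S_1 + 1/S_2$.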

In Ref. [72], Gürsel and Tinto suggest a method of estimating $\mathbf{m}$ from $\vec{s}(t)$ for a network of three detectors. For white noise and for the special case of one detector per detector site, the Gürsel-Tinto estimator is the same as the maximum likelihood estimator $\mathbf{m}_{\rm ML}(\vec{s})$ just discussed, with one major modification: in Sec. V of Ref. [72], Gürsel and Tinto prescribe discarding those Fourier components of the data whose SNR is below a certain threshold as the first stage of calculating their estimator.

Turn, now, to the issue of estimating the waveform shapes $h_+(t)$ and $h_\times(t)$. In general situations where both $\mathbf{m}$ and $h_A(t)$ are unknown, the best way to proceed in principle would be to integrate the probability distribution (A18) over all solid angles $\mathbf{m}$ to obtain a reduced probability distribution $\mathcal{P}[h_A(t)\,|\,\vec{s}(t)]$ for the waveform shapes, and to use this reduced probability distribution to make estimators of $h_A(t)$. However, such an integration cannot be performed analytically and would not be easy numerically; in practice simpler estimators will likely be used. One such simpler estimator is the maximum likelihood estimator of $h_A(t)$ obtained from Eq. (A18). In the case of no prior information about the waveform shape, when the prior distribution $p^{(0)}[h_A(t)]$ is very broad, this maximum likelihood estimator is simply $\hat h_A(t)$ evaluated at the value $\mathbf{m}_{\rm ML}(\vec{s})$ of $\mathbf{m}$ discussed above.

For BBH mergers, in many cases the direction $\mathbf{m}$ to the source will have been measured from the inspiral portion of the waveform, and thus for the purposes of estimating the merger waveform's shape, $\mathbf{m}$ can be regarded as known. The probability distribution for $h_A(t)$ given $\mathbf{m}$ and $\vec{s}(t)$ is, from Eq. (A18),

$$\mathcal{P}[h_A(t)\,|\,\mathbf{m},\vec{s}(t)] = \mathcal{K}'\,p^{(0)}[h_A(t)]\,e^{-\Lambda''/2}. \tag{A26}$$

Here $\mathcal{K}'$ is a normalization constant, and $\Lambda''$ is given by Eq. (A20) with the term $\mathcal{S}(f,\mathbf{m})$ omitted. The maximum likelihood estimator of $h_A(t)$ obtained from this probability distribution in the limit of no prior information is again just $\hat h_A(t)$. The formula for the estimator $\hat h_A(t)$ given by Eqs. (A15), (A16), (A21) and (A22) is one of the key results of this appendix. It specifies the best-fit waveform shape as a unique function of the detector outputs $s_a(t)$ for any network of detectors.
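As an illustration of how the estimator of Eqs. (A21)-(A23) behaves, the following Python sketch evaluates it at one frequency for a hypothetical three-site network. The beam pattern values, effective noise levels, time delays and waveform amplitudes are invented for the example, and the function names are not from the paper. The sketch also assumes that the noise-free site output is $\tilde h_\alpha = F_\alpha^A \tilde h_A e^{-2\pi i f\tau_\alpha}$, which is the sign convention implied by the phase factor in Eq. (A22).

```python
import cmath

def theta_matrix(F, S_eff):
    """Theta^{AB} = sum_alpha F_alpha^A F_alpha^B / S_alpha^(eff), Eq. (A21)."""
    return [[sum(F[al][A] * F[al][B] / S_eff[al] for al in range(len(F)))
             for B in range(2)] for A in range(2)]

def estimate_waveforms(F, S_eff, tau, s, f):
    """Best-fit (h_plus, h_cross) at frequency f, Eq. (A22)."""
    th = theta_matrix(F, S_eff)
    det = th[0][0] * th[1][1] - th[0][1] * th[1][0]
    th_inv = [[th[1][1] / det, -th[0][1] / det],
              [-th[1][0] / det, th[0][0] / det]]
    # Delay-corrected, noise-weighted projection of the data onto each polarization
    J = [sum(F[al][B] * s[al] / S_eff[al] * cmath.exp(2j * cmath.pi * f * tau[al])
             for al in range(len(F))) for B in range(2)]
    return [sum(th_inv[A][B] * J[B] for B in range(2)) for A in range(2)]

def residual(F, S_eff, tau, s, f):
    """The quantity S(f, m) of Eq. (A23): data power not explained by the fit."""
    th = theta_matrix(F, S_eff)
    h_hat = estimate_waveforms(F, S_eff, tau, s, f)
    fit = sum(th[A][B] * h_hat[A].conjugate() * h_hat[B]
              for A in range(2) for B in range(2)).real
    return sum(abs(s[al]) ** 2 / S_eff[al] for al in range(len(F))) - fit

def site_signal(F, tau, h, f):
    """Noise-free site outputs under the assumed delay convention."""
    return [(F[al][0] * h[0] + F[al][1] * h[1])
            * cmath.exp(-2j * cmath.pi * f * tau[al]) for al in range(len(F))]
```

With noise-free data and the correct delays, `estimate_waveforms` returns the injected amplitudes and `residual` vanishes to rounding error. With a wrong set of delays (a wrong trial direction) the residual becomes positive, which is why minimizing the quantity (A25) locates the source.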



3. Incorporating prior information 

In Sec. V, we suggested a method of reconstruction of the merger waveform shape, for a single detector, which incorporated assumed prior information as to the waveform's properties. In this appendix, our discussion so far has neglected all prior information about the shape of the waveforms $h_+(t)$ and $h_\times(t)$. We now discuss waveform estimation for a network of detectors, incorporating prior information, for fixed sky direction $\mathbf{m}$.

With a few minor modifications, the entire discussion of Sec. V can be applied to a network of detectors. The required modifications are as follows. First, the linear space $V$ should be taken to be the space of pairs of waveforms $\{h_+(t), h_\times(t)\}$, suitably discretized, so that the dimension of $V$ is $2T'/\Delta t$. Second, the inner product (2.14) must be replaced by a discrete version of the inner product

$$\left(\{h_+, h_\times\}\,\middle|\,\{k_+, k_\times\}\right) = 4\,\mathrm{Re}\int_0^\infty df\,\Theta^{AB}(f,\mathbf{m})\,\tilde h_A(f)^*\,\tilde k_B(f), \tag{A27}$$

since the inner product (A27) plays the same role in the probability distribution (A26) as the inner product (2.14) plays in the distribution (5.3). Third, the estimated waveforms $\{\hat h_+(t), \hat h_\times(t)\}$ given by Eq. (A22) take the place of the measured waveform $\mathbf{s}$ in Sec. V, for the same reason. Fourth, the wavelet basis used to specify the prior information must be replaced by a basis of the form $\{w^+_{ij}(t), w^\times_{kl}(t)\}$, where $w^+_{ij}(t)$ is a wavelet basis of the type discussed in Sec. V for the space of waveforms $h_+(t)$, and $w^\times_{kl}(t)$ is a similar wavelet basis for the space of waveforms $h_\times(t)$. The prior information about, for example, the assumed duration and bandwidths of the waveforms $h_+(t)$ and $h_\times(t)$ can then be represented exactly as in Sec. V. With these modifications, the remainder of the analyses of Sec. V apply directly to a network of detectors. Thus the "perpendicular projection" estimator (5.4) and the more general estimator (5.16) (corresponding to the more general algorithm described in Sec. V C) can both be applied to a network of detectors.



4. The Gürsel-Tinto waveform estimator

As mentioned in Sec. V above, Gürsel and Tinto have suggested an estimator of the waveforms $h_+(t)$ and $h_\times(t)$ for networks of three detector sites with one detector at each site [72], in the case when the direction $\mathbf{m}$ to the source is known. In our notation, the construction of that estimator can be summarized as follows. First, assume that the estimator is some linear combination of the outputs of the independent detectors corrected for time delays:

$$\hat{\tilde h}^{(GT)}_A(f) = \sum_{\alpha=1}^{n_s} w_A^\alpha(\mathbf{m})\,e^{2\pi i f\tau_\alpha(\mathbf{m})}\,\tilde s_\alpha(f). \tag{A28}$$

Here $\hat h^{(GT)}_A$ is the Gürsel-Tinto ansatz for the estimator, and $w_A^\alpha$ are some arbitrary constants that depend on $\mathbf{m}$. [Since there is only one detector per site we can neglect the distinction between the output $\tilde s_a(f)$ of an individual detector and the output $\tilde s_\alpha(f)$ of a detector site.] Next, demand that for a noise-free signal, the estimator reduces to the true waveforms $h_A(t)$. From Eqs. (A1) and (A8) above, this requirement is equivalent to the equation

$$\sum_{\alpha=1}^{n_s} w_A^\alpha(\mathbf{m})\,F_\alpha^B(\mathbf{m}) = \delta_A^B. \tag{A29}$$

There is a two dimensional linear space of tensors $w_A^\alpha$ which satisfy Eq. (A29). Finally, choose $w_A^\alpha$ subject to Eq. (A29) to minimize the expected value with respect to the noise of the quantity

$$\sum_{A=+,\times} \int dt\,\left|\hat h^{(GT)}_A(t) - h_A(t)\right|^2, \tag{A30}$$

where $\hat h^{(GT)}_A(t)$ is given as a functional of $h_A(t)$ and the detector noise $n_a(t)$ by Eqs. (A1), (A8) and (A28).

It is straightforward to show by a calculation using Lagrange multipliers that the resulting estimator is given by

$$\hat h^{(GT)}_A(t) = \hat h_A(t). \tag{A31}$$

In other words, the Gürsel-Tinto estimator coincides with the maximum likelihood estimators of $h_+(t)$ and $h_\times(t)$ discussed in this appendix in the case of little prior information. However, the estimators discussed here generalize the Gürsel-Tinto estimator by allowing an arbitrary number of detectors per site [with the effective output and effective noise spectrum of a site being given by Eqs. (A16) and (A15) above], by allowing an arbitrary number of sites, and by allowing one to incorporate prior information about the waveform shapes.
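The Lagrange multiplier argument can be illustrated with a single-frequency analogue. The Python sketch below is not the calculation of Ref. [72]; it assumes three independent sites with real weights and noise powers $S_\alpha$, uses invented beam pattern values, and all function names are hypothetical. It checks that the weights implied by Eq. (A22), $w_A^\alpha = \Theta_{AB} F_\alpha^B / S_\alpha$, satisfy the constraint (A29), and that moving along the remaining freedom in $w_A^\alpha$ only increases the noise power of the estimate.

```python
def ml_weights(F, S_eff):
    """w[A][alpha] = Theta_{AB} F_alpha^B / S_alpha, the weights implicit in Eq. (A22)."""
    n = len(F)
    th = [[sum(F[al][A] * F[al][B] / S_eff[al] for al in range(n))
           for B in range(2)] for A in range(2)]
    det = th[0][0] * th[1][1] - th[0][1] * th[1][0]
    th_inv = [[th[1][1] / det, -th[0][1] / det],
              [-th[1][0] / det, th[0][0] / det]]
    return [[sum(th_inv[A][B] * F[al][B] for B in range(2)) / S_eff[al]
             for al in range(n)] for A in range(2)]

def constraint(w, F):
    """Left-hand side of Eq. (A29): sum_alpha w_A^alpha F_alpha^B."""
    return [[sum(w[A][al] * F[al][B] for al in range(len(F))) for B in range(2)]
            for A in range(2)]

def noise_variance(w, S_eff):
    """Noise power of the estimate for independent sites (assumed model)."""
    return sum(w[A][al] ** 2 * S_eff[al]
               for A in range(2) for al in range(len(S_eff)))

def null_direction(F):
    """Vector n_alpha with sum_alpha n_alpha F_alpha^B = 0 (three sites only)."""
    p = [F[al][0] for al in range(3)]
    c = [F[al][1] for al in range(3)]
    return [p[1] * c[2] - p[2] * c[1],
            p[2] * c[0] - p[0] * c[2],
            p[0] * c[1] - p[1] * c[0]]

def perturbed_weights(w, F, eps):
    """Another solution of Eq. (A29): w_A^alpha + eps_A n_alpha."""
    n = null_direction(F)
    return [[w[A][al] + eps[A] * n[al] for al in range(3)] for A in range(2)]
```

The two components of `eps` parametrize the two dimensional space of solutions of Eq. (A29) mentioned above; the noise power is smallest at `eps = [0, 0]`.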



APPENDIX B: MEASURES OF INFORMATION 

In this appendix we substantiate the claims concerning information theory made in Sec. IX of the body of the paper. First, we argue that the concept of the "relative information" of two PDFs introduced in Eq. (9.2) does have the interpretation we ascribed to it: it is the base 2 logarithm of the number of distinguishable measurement outcomes. Second, we derive the approximate equations (9.8) and (9.12).



Consider first the issue of ascribing to any measurement process a "number of bits of information gained" from that process, which corresponds to the base 2 logarithm of the number of distinguishable possible outcomes of the measurement. If $p^{(0)}(\mathbf{x})$ is the PDF for the measured quantities $\mathbf{x} = (x^1, \ldots, x^n)$ before the measurement, and $p(\mathbf{x})$ is the corresponding PDF after the measurement, then the relative information of these two PDFs is defined to be

$$I \equiv \int d^n x\,p(\mathbf{x})\,\log_2\left[\frac{p(\mathbf{x})}{p^{(0)}(\mathbf{x})}\right]. \tag{B1}$$

In simple examples, it is easy to see that the quantity (B1) reduces to the number of bits of information gained in the measurement. For instance, if $\mathbf{x} = (x^1)$ and the prior PDF $p^{(0)}$ constrains $x^1$ to lie in some range of size $X$, and if after the measurement $x^1$ is constrained to lie in a small interval of size $\Delta x$, then $I \approx \log_2(X/\Delta x)$, as one would expect. In addition, the quantity (B1) has the desirable feature that it is coordinate independent, i.e., that the same answer is obtained when one makes a nonlinear coordinate transformation on the manifold parameterized by $(x^1, \ldots, x^n)$ before evaluating the quantity (B1). For these reasons, in any measurement process, the quantity (B1) can be interpreted as the number of bits of information gained [73].
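The one-dimensional example above can be reproduced directly from Eq. (B1). The following Python sketch is a minimal illustration under simple assumptions: both PDFs are taken to be uniform (top-hat) distributions, the integral is done with a midpoint rule, and the names `relative_information` and `uniform` are introduced here for the example only.

```python
import math

def relative_information(p, p0, lo, hi, cells=1600):
    """Midpoint-rule estimate of I = integral p log2(p / p0) dx on [lo, hi], Eq. (B1)."""
    dx = (hi - lo) / cells
    total = 0.0
    for k in range(cells):
        x = lo + (k + 0.5) * dx
        px = p(x)
        if px > 0.0:  # the integrand vanishes where the posterior is zero
            total += px * math.log2(px / p0(x)) * dx
    return total

def uniform(a, b):
    """PDF that is constant on [a, b] and zero elsewhere."""
    return lambda x: 1.0 / (b - a) if a <= x <= b else 0.0

# Prior range X = 8, posterior interval of size 0.5: log2(8 / 0.5) = 4 bits.
prior = uniform(0.0, 8.0)
posterior = uniform(2.0, 2.5)
bits = relative_information(posterior, prior, 0.0, 8.0)
```

Halving the posterior interval adds one bit, in line with the interpretation of (B1) as the base 2 logarithm of the number of distinguishable outcomes.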



1. Explicit formula for the total information 

As a foundation for deriving the approximate formula (9.8), we derive in this subsection an explicit formula [Eq. (B13)] for the total information gain (9.2) in a gravitational wave measurement. We shall use a basis of $V$ where the matrix (2.11) is unity, and for ease of notation we shall denote by $\mathcal{N}$ the quantity which was denoted by $\mathcal{N}_{\rm bins}$ in the body of the paper.

First, we assume that the prior PDF $p^{(0)}(\mathbf{h})$ appearing in Eq. (9.1) is a function only of $h = \|\mathbf{h}\|$. In other words, all directions in the vector space $V$ are taken to be, a priori, equally likely, when one measures distances and angles with the inner product (2.14). It would be more realistic to make such an assumption with respect to a noise-independent inner product like $(h_1|h_2) = \int dt\,h_1(t)h_2(t)$, but if the noise spectrum $S_h(f)$ does not vary too rapidly within the bandwidth of interest, the distinction is not too important and our assumption will be fairly realistic. We write the prior PDF as [74]

$$p^{(0)}(\mathbf{h})\,d^{\mathcal N}h = \frac{2\pi^{\mathcal N/2}}{\Gamma(\mathcal N/2)}\,h^{\mathcal N-1}\,p^{(0)}(\mathbf{h})\,dh \equiv \bar p^{(0)}(h)\,dh. \tag{B2}$$

The quantity $\bar p^{(0)}(h)\,dh$ is the prior probability that the signal $\mathbf{h}$ will have an SNR $\|\mathbf{h}\|$ between $h$ and $h + dh$. The exact form of the PDF $\bar p^{(0)}(h)$ will not be too important for our calculations below. A moderately realistic choice is $\bar p^{(0)}(h) \propto 1/h^3$ with a cutoff at some $h_1 \ll 1$. Note however that the choice $p^{(0)}(\mathbf{h}) = 1$, corresponding to $\bar p^{(0)}(h) \propto h^{\mathcal N-1}$, is very unrealistic. Below we shall assume that $\bar p^{(0)}(h)$ is independent of $\mathcal{N}$.

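We next write Eq. (9.1) in a more explicit form. Without loss of generality we can take

$$(s^1, \ldots, s^{\mathcal N}) = (s, 0, \ldots, 0), \tag{B3}$$

where $s = \rho(\mathbf{s}) = \|\mathbf{s}\|$. Then, writing $(\mathbf{s}|\mathbf{h}) = s h \cos\theta$ and using the useful identity

$$d^{\mathcal N}h = \frac{2\pi^{(\mathcal N-1)/2}}{\Gamma[(\mathcal N-1)/2]}\,\sin(\theta)^{\mathcal N-2}\,h^{\mathcal N-1}\,d\theta\,dh, \tag{B4}$$

we can write

$$p(\mathbf{h}\,|\,\mathbf{s})\,d^{\mathcal N}h = \mathcal{K}_1\,\bar p^{(0)}(h)\,\sin(\theta)^{\mathcal N-2}\,\exp\left[-\frac{1}{2}\left(s^2 + h^2 - 2sh\cos\theta\right)\right] d\theta\,dh, \tag{B5}$$

where $\mathcal{K}_1$ is a constant. If we define the function $F_{\mathcal N}(x)$ by

$$F_{\mathcal N}(x) \equiv \frac{1}{2}\int_0^\pi d\theta\,\sin(\theta)^{\mathcal N-2}\,e^{x\cos\theta}, \tag{B6}$$

then the constant $\mathcal{K}_1$ is determined by the normalization condition

$$1 = 2\mathcal{K}_1 \int_0^\infty dh\,e^{-(s^2+h^2)/2}\,F_{\mathcal N}(sh)\,\bar p^{(0)}(h). \tag{B7}$$

We next calculate the PDF $p[\mathbf{h}\,|\,\rho(\mathbf{s})]$ appearing in the denominator in Eq. (9.2). From Bayes's theorem, this PDF is given by

$$p[\mathbf{h}\,|\,\rho(\mathbf{s})] = \mathcal{K}\,p^{(0)}(\mathbf{h})\,p[\rho(\mathbf{s})\,|\,\mathbf{h}], \tag{B8}$$

where $p[\rho(\mathbf{s})\,|\,\mathbf{h}]$ is the PDF for $\rho(\mathbf{s})$ given that the gravitational wave signal is $\mathbf{h}$, and $\mathcal{K}$ is a normalization constant. Using the fact that $p(\mathbf{s}\,|\,\mathbf{h}) \propto \exp[-(\mathbf{s}-\mathbf{h})^2/2]$, we find using Eq. (B4) that

$$p(\mathbf{s}\,|\,\mathbf{h})\,d^{\mathcal N}s = \frac{2^{1-\mathcal N/2}}{\sqrt{\pi}\,\Gamma[(\mathcal N-1)/2]}\,\sin(\theta)^{\mathcal N-2}\,s^{\mathcal N-1}\,\exp\left[-\frac{1}{2}\left(s^2 + h^2 - 2sh\cos\theta\right)\right] ds\,d\theta. \tag{B9}$$

Integrating over $\theta$ now yields from Eq. (B6)

$$p[\rho(\mathbf{s}) = s\,|\,\mathbf{h}]\,ds \propto s^{\mathcal N-1}\,e^{-(s^2+h^2)/2}\,F_{\mathcal N}(sh)\,ds. \tag{B10}$$

Now combining Eqs. (B4), (B8), and (B10) yields

$$p[\mathbf{h}\,|\,\rho(\mathbf{s})]\,d^{\mathcal N}h = \mathcal{K}_2\,\bar p^{(0)}(h)\,e^{-(\rho(\mathbf{s})^2+h^2)/2}\,F_{\mathcal N}[\rho(\mathbf{s})h]\,\sin(\theta)^{\mathcal N-2}\,dh\,d\theta, \tag{B11}$$

where from Eq. (B7) the normalization constant is given by

$$\mathcal{K}_2 = \frac{2\,\Gamma(\mathcal N/2)}{\sqrt{\pi}\,\Gamma[(\mathcal N-1)/2]}\,\mathcal{K}_1. \tag{B12}$$

We can now calculate the information $I_{\rm total}$ by combining Eqs. (9.2), (B5), (B6), (B11), and (B12). The result is

$$I_{\rm total}[\rho(\mathbf{s}), \mathcal N] = \int_0^\infty dh\,p^{(1)}(h)\,G_{\mathcal N}[\rho(\mathbf{s})h] - \log_2\left[\frac{2\,\Gamma(\mathcal N/2)}{\sqrt{\pi}\,\Gamma[(\mathcal N-1)/2]}\right], \tag{B13}$$

where

$$G_{\mathcal N}(x) \equiv \frac{1}{\ln 2}\left[\frac{x\,F'_{\mathcal N}(x)}{F_{\mathcal N}(x)} - \ln F_{\mathcal N}(x)\right] \tag{B14}$$

and

$$p^{(1)}(h) = 2\mathcal{K}_1\,\bar p^{(0)}(h)\,e^{-(\rho(\mathbf{s})^2+h^2)/2}\,F_{\mathcal N}[\rho(\mathbf{s})h]. \tag{B15}$$

Equations (B6), (B7), and (B13)-(B15) now define explicitly the total information $I_{\rm total}$ as a function of the parameters $\rho(\mathbf{s})$ and $\mathcal{N}$ and of the prior PDF $\bar p^{(0)}(h)$.

2. Approximate formula for the total information

We now derive the approximate formula (9.8) for the total information. Let $\rho_b^2 \equiv \rho(\mathbf{s})^2/\mathcal{N}$; we will consider the limit of large $\rho(\mathbf{s})$ and $\mathcal{N}$ but fixed $\rho_b$. Our analysis will divide into two cases, depending on whether $\rho_b > 1$ or $\rho_b < 1$. Let us first consider the case $\rho_b > 1$. In the large $\mathcal{N}$ limit the result for $\rho_b > 1$ will be independent of the prior PDF $\bar p^{(0)}(h)$, which we assume has no dependence on $\mathcal{N}$.

The first term in Eq. (B13) is the expected value $\langle G_{\mathcal N}[\rho(\mathbf{s})h]\rangle$ of $G_{\mathcal N}[\rho(\mathbf{s})h]$ with respect to the PDF (B15). If we change the variable of integration in this term from $h$ to $u = h/\sqrt{\mathcal N}$, we find

$$\langle G_{\mathcal N}[\rho(\mathbf{s})h]\rangle \propto \int_0^\infty du\,\bar p^{(0)}(\sqrt{\mathcal N}u)\,e^{-\mathcal N(\rho_b^2+u^2)/2}\,F_{\mathcal N}(\mathcal N\rho_b u)\,G_{\mathcal N}(\mathcal N\rho_b u). \tag{B16}$$

From Eq. (B6) it is straightforward to show that in the limit of large $\mathcal{N}$,

$$F_{\mathcal N}(\mathcal N z) \approx \frac{e^{\mathcal N q(\theta_c)}}{2\sin^2\theta_c}\,\sqrt{\frac{2\pi}{\mathcal N\,|q''(\theta_c)|}}, \tag{B17}$$

for fixed $z$. Here $q(\theta)$ is the function

$$q(\theta) \equiv z\cos\theta + \ln\sin\theta, \tag{B18}$$

and $\theta_c = \theta_c(z)$ is the value of $\theta$ which maximizes the function $q(\theta)$, given implicitly by

$$z\sin^2\theta_c = \cos\theta_c. \tag{B19}$$

We similarly find that

$$F'_{\mathcal N}(\mathcal N z) \approx \frac{e^{\mathcal N q(\theta_c)}}{2\sin^2\theta_c}\,\sqrt{\frac{2\pi}{\mathcal N\,|q''(\theta_c)|}}\,\cos\theta_c. \tag{B20}$$

It is legitimate to use the approximations (B17) and (B20) in the integral (B16) since the value $u_{\max}(\mathcal N, \rho_b)$ of $u$ at which the PDF (B15) is a maximum approaches at large $\mathcal{N}$ a constant $u_{\max}(\rho_b)$ which is independent of $\mathcal{N}$, as we show below.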
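The saddle-point estimate of $F_{\mathcal N}$ can be compared with direct numerical integration. The Python sketch below is a rough check only, under two stated assumptions: $F_{\mathcal N}(x)$ is taken to be one half of the integral of $\sin(\theta)^{\mathcal N-2} e^{x\cos\theta}$ over $0 \le \theta \le \pi$ (the normalization implied by Eq. (B7)), and the large-$\mathcal N$ estimate is the leading-order Laplace approximation about the maximum of $q(\theta) = z\cos\theta + \ln\sin\theta$. The function names are hypothetical.

```python
import math

def F(N, x, steps=20000):
    """F_N(x) = (1/2) int_0^pi sin(theta)^(N-2) exp(x cos(theta)) dtheta, midpoint rule."""
    d = math.pi / steps
    total = 0.0
    for k in range(steps):
        th = (k + 0.5) * d
        total += math.sin(th) ** (N - 2) * math.exp(x * math.cos(th))
    return 0.5 * total * d

def theta_c(z):
    """Maximum of q(theta): the root of z sin^2(theta) = cos(theta) in (0, pi/2]."""
    return math.acos((-1.0 + math.sqrt(1.0 + 4.0 * z * z)) / (2.0 * z))

def F_laplace(N, z):
    """Leading-order Laplace estimate of F_N(N z) for fixed z > 0."""
    tc = theta_c(z)
    q = z * math.cos(tc) + math.log(math.sin(tc))
    q2 = -z * math.cos(tc) - 1.0 / math.sin(tc) ** 2   # q''(theta_c), negative
    return (0.5 * math.exp(N * q) / math.sin(tc) ** 2
            * math.sqrt(2.0 * math.pi / (N * abs(q2))))

# Ratio of the exact integral to the estimate at N = 200, z = 1.
ratio = F(200, 200.0) / F_laplace(200, 1.0)
```

The ratio should differ from unity only by a correction of relative order $1/\mathcal{N}$, which does not affect the exponent $\mathcal{N} q(\theta_c)$ that controls the large-$\mathcal N$ behavior.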



Inserting the approximation (B17) into Eq. (B16) and identifying $z = \rho_b u$, we find that the PDF (B15) is proportional to

$$\exp\left[\mathcal N Q(u) + O(1)\right], \tag{B21}$$

where

$$Q(u) \equiv -\frac{1}{2}\left(\rho_b^2 + u^2\right) + q(\theta_c), \tag{B22}$$

and $\theta_c = \theta_c(z) = \theta_c(\rho_b u)$. From Eqs. (B18) and (B19) it can be shown that the function (B22) has a local maximum at

$$u = u_{\max} = \sqrt{\rho_b^2 - 1}, \tag{B23}$$

at which point $\theta_c$ is given by $\sin\theta_c = 1/\rho_b$. The form of the PDF (B21) now shows that at large $\mathcal{N}$,

$$\langle G_{\mathcal N}[\rho(\mathbf{s})h]\rangle \approx G_{\mathcal N}[\mathcal N\rho_b u_{\max}]. \tag{B24}$$

Finally, if we combine Eqs. (B13), (B17), (B20), (B23) and (B24) and use Stirling's formula to approximate the Gamma functions, we obtain Eq. (9.3).

Turn, next, to the case $\rho_b < 1$. In this case the function $Q$ does not have a local maximum, and the dominant contribution to the integral (B16) at large $\mathcal{N}$ comes from $h \sim O(1)$ (rather than from $h \sim \sqrt{\mathcal N}$, $u \sim O(1)$ as was the case above). From Eq. (B6) we obtain the approximations

$$G_{\mathcal N}(\sqrt{\mathcal N}w) \approx \frac{1}{2\ln 2}\left[w^2 + \ln\left(\frac{2\mathcal N}{\pi}\right)\right]\left[1 + O(1/\sqrt{\mathcal N})\right] \tag{B25}$$

and

$$F_{\mathcal N}(\sqrt{\mathcal N}w) \approx \sqrt{\frac{\pi}{2\mathcal N}}\,e^{w^2/2}\left[1 + O(1/\sqrt{\mathcal N})\right], \tag{B26}$$

which are valid for fixed $w$ at large $\mathcal{N}$. Using Eqs. (B25), (B26), and (B13)-(B15), and using Stirling's formula again we find that

$$I_{\rm total} \approx \frac{\rho_b^2}{2\ln 2}\,\frac{\int_0^\infty dh\,\bar p^{(0)}(h)\,\exp\left[-(1-\rho_b^2)h^2/2\right]h^2}{\int_0^\infty dh\,\bar p^{(0)}(h)\,\exp\left[-(1-\rho_b^2)h^2/2\right]}. \tag{B27}$$

For simplicity we now take $\bar p^{(0)}(h)$ to be a Gaussian centered at zero with width $h_{\rm prior}$; this yields

$$I_{\rm total} \approx \frac{\rho_b^2}{2\ln 2}\,\frac{h_{\rm prior}^2}{1 + (1-\rho_b^2)\,h_{\rm prior}^2}. \tag{B28}$$

From Eq. (9.7), the parameter $\rho_b$ is given by

$$\rho_b^2 = \frac{\rho^2}{\mathcal N_{\rm bins}} + 1 \pm \sqrt{\frac{2}{\mathcal N_{\rm bins}}}, \tag{B29}$$

where the last term denotes the rms magnitude of the statistical fluctuations. Since we are assuming that $\rho_b < 1$, it follows that $\rho_b^2 \sim 1 - 1/\sqrt{\mathcal N_{\rm bins}}$, and therefore we obtain from Eq. (B28) that

$$I_{\rm total} \lesssim \frac{1}{2}\min\left[h_{\rm prior}^2,\,\sqrt{\mathcal N_{\rm bins}}\right]. \tag{B30}$$

Thus, if $h_{\rm prior} \lesssim 1$, then the total information gain is $\lesssim 1$ also.

3. Approximate formula for the source information

We now turn to a discussion of the approximate formula (9.12) for the information $I_{\rm source}$ [Eq. (9.11)] obtained about the source of the gravitational waves. In general, the measure of information (9.11) depends in a complex way on the prior PDF $p^{(0)}(\mathbf{h})$, and on how the waveform $\mathbf{h}(\boldsymbol{\theta})$ depends on the source parameters $\boldsymbol{\theta}$. We can evaluate the information $I_{\rm source}$ explicitly in the simple and unrealistic model where the dependence on the source parameters $\boldsymbol{\theta}$ is linear and where there is little prior information. In this case the manifold of possible signals is a linear subspace (with dimension $\mathcal{N}_{\rm param}$) of the linear space of all possible signals (which has dimension $\mathcal{N}$). The integral (9.11) then reduces to an integral analogous to (9.2), and we obtain the formula (9.12) in the same way as we obtained Eq. (9.8). The result (9.12) is clearly a very crude approximation, as the true manifold of merger signals is very curved and nonlinear. Nevertheless, it seems likely that the formula (9.12) will be valid for some effective number of parameters $\mathcal{N}_{\rm param}$ that is not too much different from the true number of parameters on which the waveform depends.

APPENDIX C: EXPECTED VALUE OF CORRELATION COEFFICIENT

In this appendix we derive the formula (5.18) for the expected value of the correlation coefficient (5.17). We start by deriving the following general result. Let $\mathbf{n} = (n^1, \ldots, n^N)$ be a Gaussian random variable with $\langle\mathbf{n}\rangle = 0$ and $\langle n^i n^j\rangle = \Sigma^{ij}$. Let $\mathbf{h} = (h^1, \ldots, h^N)$ be a fixed vector, and define the random variable $C$ by

$$C \equiv \frac{\mathbf{h}\cdot\mathbf{\Sigma}^{-1}\cdot(\mathbf{h}+\mathbf{n})}{\sqrt{\mathbf{h}\cdot\mathbf{\Sigma}^{-1}\cdot\mathbf{h}}\,\sqrt{(\mathbf{h}+\mathbf{n})\cdot\mathbf{\Sigma}^{-1}\cdot(\mathbf{h}+\mathbf{n})}}. \tag{C1}$$

Then, in the regime $N \gg 1$ and $\rho \gg 1$, where $\rho^2 = \mathbf{h}\cdot\mathbf{\Sigma}^{-1}\cdot\mathbf{h}$, we have

$$\langle C\rangle = \frac{1}{\sqrt{1 + N/\rho^2}}\left[1 + O\!\left(\frac{1}{\rho^2},\,\frac{1}{N}\right)\right]. \tag{C2}$$

Equation (5.18) can be obtained from Eq. (C2) as follows. From the definition of the perpendicular projection estimator, the vector $\mathbf{h}_{\rm best\text{-}fit}$ can be written

$$\mathbf{h}_{\rm best\text{-}fit} = \mathbf{h} + \mathbf{n}_\parallel, \tag{C3}$$

where $\mathbf{n}_\parallel$ denotes the component of the noise $\mathbf{n}$ in the space $U$. Thus, the vectors $\mathbf{h}$ and $\mathbf{h}_{\rm best\text{-}fit}$ which appear in Eq. (5.17) both lie in the space $U$ of dimension $\mathcal{N}_{\rm bins}$ (although both nominally lie in the larger space $V$ of dimension $\mathcal{N}'_{\rm bins}$). Now, identifying $N$ and $\mathcal{N}_{\rm bins}$, we see that the quantities (5.17) and (C1) coincide, and the result (5.18) follows.

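We now turn to the derivation of Eq. (C2). First, make a linear change of variables to make $\Sigma_{ij} = \delta_{ij}$. (The results obtained at the end can be generalized to non-unit $\mathbf{\Sigma}$ by inspection.) We want to evaluate

$$\langle C\rangle = \int dn^1 \cdots \int dn^N\,p(\mathbf{n})\,C(\mathbf{n}), \tag{C4}$$

where $C(\mathbf{n})$ is given by Eq. (C1). The quantity $C(\mathbf{n})$ depends on $\mathbf{n}$ only through the combinations

$$\alpha \equiv \mathbf{h}\cdot\mathbf{n} = \sum_{i=1}^N h^i n^i \tag{C5}$$

and

$$\beta \equiv \mathbf{n}\cdot\mathbf{n} = \sum_{i=1}^N (n^i)^2. \tag{C6}$$

Hence

$$\langle C\rangle = \int d\alpha \int d\beta\,p(\alpha,\beta)\,C(\alpha,\beta), \tag{C7}$$

where

$$C(\alpha,\beta) = \frac{\rho^2 + \alpha}{\rho\,\sqrt{\rho^2 + 2\alpha + \beta}}. \tag{C8}$$

The probability distribution $p(\alpha,\beta)$ can be approximately evaluated in the following way. We have

$$p(\alpha,\beta) = p(\beta\,|\,\alpha)\,p(\alpha). \tag{C9}$$

Here $p(\beta\,|\,\alpha)$ is the distribution for $\beta$ given a value of $\alpha$, and $p(\alpha)$ is from Eq. (C5) a Gaussian with zero mean and variance $\rho^2$:

$$p(\alpha) = \frac{1}{\sqrt{2\pi}\,\rho}\,\exp\left\{-\alpha^2/(2\rho^2)\right\}. \tag{C10}$$

We introduce the notation

$$\langle f(\beta)\rangle_\alpha \equiv \int f(\beta)\,p(\beta\,|\,\alpha)\,d\beta, \tag{C11}$$

and define $(\Delta\beta)^2_\alpha = \langle\beta^2\rangle_\alpha - \langle\beta\rangle^2_\alpha$. The distribution $p(\beta\,|\,\alpha)$ can be treated as being approximately Gaussian in the regime where $(\Delta\beta)_\alpha \ll \langle\beta\rangle_\alpha$, which we show below is the case when $\rho^2 \gg 1$ and $N \gg 1$. Hence we need only evaluate $\langle\beta\rangle_\alpha$ and $(\Delta\beta)_\alpha$.

Without loss of generality we can write $\mathbf{h} = (\rho, 0, \ldots, 0)$, so that from Eq. (C5), $\alpha = \rho n^1$. Similarly Eq. (C6) gives

$$\beta = \frac{\alpha^2}{\rho^2} + \sum_{i=2}^N (n^i)^2. \tag{C12}$$

Using the fact that the $n^i$ are independent normally distributed random variables, it follows from Eq. (C12) that

$$\langle\beta\rangle_\alpha = \frac{\alpha^2}{\rho^2} + N - 1, \tag{C13}$$

and similarly we find

$$(\Delta\beta)^2_\alpha = 2(N-1). \tag{C14}$$

Now the integral (C7) will be dominated by contributions from the regime $\alpha \lesssim$ (a few) $\times\,\rho$. In this regime, we have

$$\frac{(\Delta\beta)_\alpha}{\langle\beta\rangle_\alpha} \ll 1, \tag{C15}$$

which justifies our treating the PDF $p(\beta\,|\,\alpha)$ as Gaussian. Combining these results we find

$$p(\alpha,\beta) = \frac{1}{2\pi\rho\,(\Delta\beta)_\alpha}\,\exp\left\{-\frac{\alpha^2}{2\rho^2} - \frac{(\beta - \langle\beta\rangle_\alpha)^2}{2(\Delta\beta)^2_\alpha}\right\}. \tag{C16}$$

Inserting this distribution into Eq. (C7), using Eq. (C8) and expanding to second order in $\alpha$ gives

$$\langle C\rangle = \int d\alpha\,p(\alpha)\,C[\alpha, \langle\beta\rangle_\alpha]\left[1 + O(1/N)\right] = \int d\alpha\,p(\alpha)\,\frac{1 + \alpha/\rho^2}{\sqrt{(1 + \alpha/\rho^2)^2 + (N-1)/\rho^2}} \times \left[1 + O(1/N)\right], \tag{C17}$$

as required.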
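The leading-order result (C2) can be checked with a small Monte Carlo experiment. The Python sketch below is an illustration, not part of the derivation: it assumes unit noise covariance, takes $\mathbf{h} = (\rho, 0, \ldots, 0)$ as in the text, and uses illustrative values of $N$, $\rho$, sample count and random seed; the function names are hypothetical.

```python
import math
import random

def correlation(h, n):
    """C of Eq. (C1) for unit noise covariance: h.(h+n) / (|h| |h+n|)."""
    s = [hi + ni for hi, ni in zip(h, n)]
    dot = sum(hi * si for hi, si in zip(h, s))
    return dot / math.sqrt(sum(hi * hi for hi in h) * sum(si * si for si in s))

def mean_correlation(N, rho, samples=5000, seed=1):
    """Monte Carlo estimate of <C> for a signal of SNR rho in N dimensions."""
    rng = random.Random(seed)
    h = [rho] + [0.0] * (N - 1)   # h = (rho, 0, ..., 0) without loss of generality
    total = 0.0
    for _ in range(samples):
        n = [rng.gauss(0.0, 1.0) for _ in range(N)]
        total += correlation(h, n)
    return total / samples

def predicted(N, rho):
    """Leading-order value 1 / sqrt(1 + N / rho^2) from Eq. (C2)."""
    return 1.0 / math.sqrt(1.0 + N / rho ** 2)
```

For example, `mean_correlation(50, 10.0)` can be compared with `predicted(50, 10.0)`; the two should agree up to the small corrections indicated in Eq. (C2) plus the sampling error of the Monte Carlo average.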
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When aliasing effects can be neglected, it can be shown 
using the method of appendix O that the expected value 
(C) of this correlation coefficient is approximately given 
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by Eq. (5.18), but with pbin now given by 
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However, we neglect this effect here as it is not possible to 
calculate it in the absence of a model waveform h(0) for 
merger waves. For inspiral waves, relatively crude tem- 
plates will suffice to detect the waves, and more accurate 
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requirement "have the same magnitude as the measured 
signal" to the requirement "have a magnitude which is 
smaller than or equal to that of the measured signal"; 
the numerical value of the information would be approx- 
imately the same. The reason we insert this requirement 
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